**Springer Proceedings in Mathematics & Statistics**

Kathrin Glau • Zorana Grbac • Matthias Scherer • Rudi Zagst, Editors

# Innovations in Derivatives Markets

Fixed Income Modeling, Valuation Adjustments, Risk Management, and Regulation

# Springer Proceedings in Mathematics & Statistics

Volume 165

This book series features volumes composed of selected contributions from workshops and conferences in all areas of current research in mathematics and statistics, including operations research and optimization. In addition to an overall evaluation of the interest, scientific quality, and timeliness of each proposal at the hands of the publisher, individual contributions are all refereed to the high quality standards of leading journals in the field. Thus, this series provides the research community with well-edited, authoritative reports on developments in the most exciting areas of mathematical and statistical research today.

More information about this series at http://www.springer.com/series/10533


Editors

Kathrin Glau
Lehrstuhl für Finanzmathematik, Technische Universität München, Garching-Hochbrück, Germany

Zorana Grbac
LPMA, Université Paris–Diderot (Paris 7), Paris Cedex 13, France

Matthias Scherer
Lehrstuhl für Finanzmathematik, Technische Universität München, Garching-Hochbrück, Germany

Rudi Zagst
Lehrstuhl für Finanzmathematik, Technische Universität München, Garching-Hochbrück, Germany

ISSN 2194-1009 ISSN 2194-1017 (electronic)
Springer Proceedings in Mathematics & Statistics
ISBN 978-3-319-33445-5 ISBN 978-3-319-33446-2 (eBook)
DOI 10.1007/978-3-319-33446-2

Library of Congress Control Number: 2016939102

© The Editor(s) (if applicable) and The Author(s) 2016. This book is published open access. Open Access This book is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, a link is provided to the Creative Commons license, and any changes made are indicated.

The images or other third party material in this book are included in the work's Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work's Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper

This Springer imprint is published by Springer Nature The registered company is Springer International Publishing AG Switzerland The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

# Preface

The financial crisis of 2007–2009 swallowed billions of dollars and caused many corporate defaults. Massive monetary intervention by the US and European central banks stabilized the global financial system, but the long-term consequences of this low interest rate/high government debt policy remain unclear. To avoid such crisis scenarios in the future, many politicians called for better regulation. The market for portfolio credit derivatives almost dried out in the aftermath of the crisis and has only recently recovered. Banks are no longer considered default free, as their CDS spreads show. This has major consequences for OTC derivative transactions between banks and their clients, since the risk of a counterparty credit default can no longer be neglected. Concerning interest rates, it has become unclear whether there are risk-free rates at all and, if so, how these should be modeled. On top of that, we have observed negative interest rates for government bonds of countries like Switzerland, Germany, and the US—a feature not captured by many stochastic models.

The conference Innovations in Derivatives Markets—Fixed income modeling, valuation adjustments, risk management, and regulation, March 30–April 1, 2015 at the Technical University of Munich shed some light on the tremendous changes in the financial system. We gratefully acknowledge the support by the KPMG Center of Excellence in Risk Management, which allowed us to bring together leading experts from fixed income markets, credit modeling, banking, and financial engineering. We thank the contributing authors to this volume for presenting the state of the art in postcrisis financial modeling and computational tools. Their contributions reflect the enormous efforts academia and the financial industry have invested in adapting to a new reality.

The financial crisis made evident that changes in risk attitude are imperative. It is therefore fortunate that postcrisis mathematical finance immediately took the path of critically reflecting on its old paradigms, identifying new rules, and, finally, implementing the necessary changes. This renewal process has led to a paradigm shift characterized by a changed attitude toward, and a reappraisal of, liquidity and counterparty risk. We are happy to invite the reader to gain insight into these changes, to learn which assumptions to trust and which ones to replace, and to enter into the discussion on how to overcome the current difficulties of a practical implementation.

Among others, the plenary speakers Damiano Brigo, Stéphane Crépey, Ernst Eberlein, John Hull, Wolfgang Runggaldier, Luis Seco, and Wim Schoutens are represented with articles in this book. The process of identifying and incorporating key features of financial assets and underlying risks is still in progress, as the reader can discover in the form of a vital panel discussion that complements the scientific contributions of the book. The book is divided into three parts. First, the vast field of counterparty credit risk is discussed. Second, FX markets and particularly multi-curve interest-rate models are investigated. The third part contains innovations in financial engineering, with diverse topics ranging from dependence modeling and the measurement of basis spreads to innovative fee structures for hedge funds.

We thank the KPMG Center of Excellence in Risk Management for the opportunity to publish this proceedings volume with online open access and acknowledge the fruitful collaboration with Franz Lorenz, Matthias Mayer, and Daniel Sommer. Moreover, we thank all speakers, visitors, the participants of the panel discussion— Damiano Brigo, Christian Fries, John Hull, Daniel Sommer, and Ralf Werner—and the local supporting team—Bettina Haas, Mirco Mahlstedt, Steffen Schenk, and Thorsten Schulz—who helped to make this conference a success.

Kathrin Glau, Garching-Hochbrück, Germany
Zorana Grbac, Paris Cedex 13, France
Matthias Scherer, Garching-Hochbrück, Germany
Rudi Zagst, Garching-Hochbrück, Germany

# Foreword

The conference "Innovations in Derivatives Markets—Fixed Income Modelling, Valuation Adjustments, Risk Management, and Regulation" was held on the campus of the Technical University of Munich in Garching-Hochbrück (Munich) from March 30 until April 1, 2015. Thanks to the great efforts of the organizers, the scientific committee, the keynote speakers, contributors, and all other participants, the conference was a huge success, bringing together academics and practitioners to learn about and to discuss state-of-the-art derivatives valuation and mathematical finance. More than 200 participants (35 % of whom were academics, 60 % practitioners, and 5 % students) had many fruitful discussions and exchanges during three days of talks.

The conference "Innovations in Derivatives Markets" and this book are part of an initiative that was founded in 2012 as a cooperation between the Chair of Mathematical Finance at the Technical University of Munich and KPMG AG Wirtschaftsprüfungsgesellschaft. This cooperation is based on three pillars: first, strengthening a scientifically challenging education of students that at the same time addresses real-world topics; second, supporting research with a particular focus on young researchers; and third, bringing together academic researchers and practitioners from the financial industry in the areas of trading, treasury, financial engineering, risk management, and risk controlling in order to develop trendsetting and viable improvements in the effective management of financial risks.

The main focus of the conference was the topic of derivatives valuation, which is a subject of great importance for the financial industry, specifically the rise of new valuation adjustments commonly referred to as "XVAs". These XVAs have gained significant attention ever since the financial crisis in 2008, when banks suffered tremendous losses due to counterparty credit risk reflected in derivatives valuation via the credit valuation adjustment, CVA. A debate on the incorporation of funding costs in derivatives valuation, starting in 2012, introduced a new letter to the XVA alphabet, the so-called funding valuation adjustment, FVA, together with its specific impact on banks' profit and loss statements. With the conference, we intended to discuss these topics in the light of market evolutions, regulatory change, and state-of-the-art research in financial mathematics by bringing together renowned scientists, practitioners, and ambitious young researchers.

Over the first two days of the conference, several keynote speeches and invited talks addressed various aspects of derivatives valuation. Topics included the fundamental change of moving from a single interest rate curve to a multiple-curve environment, the incorporation of counterparty credit risk and funding into derivatives valuation, new approaches to modeling negative interest rates, and other advances in mathematical finance. The panel discussion on the first day brought together the views of renowned representatives from academia and the financial industry on the necessity, reasonableness and, to some extent, the future of derivatives valuation and XVAs. The conference was rounded out by a day of contributed talks, giving young researchers the opportunity to present and discuss their results in front of a broad audience. All in all, the topics presented during the conference covered a large spectrum, ranging from market developments and the management of derivative valuation adjustments to theoretical advances in financial mathematics.

We would like to thank everyone who contributed to making this event a great success. In particular, we express our gratitude to the scientific committee, namely Kathrin Glau, Zorana Grbac, Matthias Scherer, and Rudi Zagst, the organizational team, namely Kathrin Glau, Bettina Haas, Mirco Mahlstedt, Matthias Scherer, Steffen Schenk, Thorsten Schulz, and Rudi Zagst, the keynote speakers, the moderator and participants of the panel discussion, all speakers of invited and contributed talks, and, last but not least, all participants who attended the conference.

We are convinced that this book will help you to gain insights into state-of-the-art research in the area of mathematical finance and to broaden your horizon on the application of mathematical concepts to the fields of derivatives valuation and risk management.

> Dr. Matthias Mayer, Dr. Daniel Sommer, and Franz Lorenz
> KPMG AG Wirtschaftsprüfungsgesellschaft

# Contents

#### Part I Valuation Adjustments




# **Part I Valuation Adjustments**

# **Nonlinearity Valuation Adjustment**

# **Nonlinear Valuation Under Collateralization, Credit Risk, and Funding Costs**

**Damiano Brigo, Qing D. Liu, Andrea Pallavicini and David Sloth**

**Abstract** We develop a consistent, arbitrage-free framework for valuing derivative trades with collateral, counterparty credit risk, and funding costs. Credit, debit, liquidity, and funding valuation adjustments (CVA, DVA, LVA, and FVA) are simply introduced as modifications to the payout cash flows of the trade position. The framework is flexible enough to accommodate actual trading complexities such as asymmetric collateral and funding rates, replacement close-out, and re-hypothecation of posted collateral—all aspects which are often neglected. The generalized valuation equation takes the form of a forward–backward SDE or semi-linear PDE. Nevertheless, it may be recast as a set of iterative equations which can be efficiently solved by our proposed least-squares Monte Carlo algorithm. We implement numerically the case of an equity option and show how its valuation changes when including the above effects. In the paper we also discuss the financial impact of the proposed valuation framework and of nonlinearity more generally. This is fourfold: First, the valuation equation is based only on observable market rates, leaving the value of a derivatives transaction invariant to any theoretical risk-free rate. Second, the presence of funding costs makes the valuation problem a highly recursive and nonlinear one. Thus, credit and funding risks are non-separable in general, and despite common practice in banks, CVA, DVA, and FVA cannot be treated as purely additive adjustments without running the risk of double counting. To quantify the valuation error that can be attributed to double counting, we introduce a "nonlinearity valuation adjustment" (NVA) and show that its magnitude can be significant under asymmetric funding rates and replacement close-out at default. Third, as trading parties cannot observe each other's liquidity policies nor their respective funding costs, the bilateral nature of a derivative price breaks down. The value of a trade to a counterparty will not be just the opposite of the value seen by the bank. Finally, valuation becomes aggregation-dependent and portfolio values cannot simply be added up. This has operational consequences for banks, calling for a holistic, consistent approach across trading desks and asset classes.

**Keywords** Nonlinear valuation · Nonlinear valuation adjustment NVA · Credit risk · Credit valuation adjustment CVA · Funding costs · Funding valuation adjustment FVA · Consistent valuation · Collateral

D. Brigo (B) · Q.D. Liu · A. Pallavicini
Department of Mathematics, Imperial College London, London, UK
e-mail: damiano.brigo@imperial.ac.uk

Q.D. Liu
e-mail: daphne.q.liu@gmail.com

A. Pallavicini
Banca IMI, Largo Mattioli 3, 20121 Milan, Italy
e-mail: andrea.pallavicini@imperial.ac.uk

D. Sloth
Rate Options & Inflation Trading, Danske Bank, Copenhagen, Denmark
e-mail: dap@danskebank.com

# **1 Introduction**

Recent years have seen an unprecedented interest among banks in understanding the risks and associated costs of running a derivatives business. The financial crisis in 2007–2008 made banks painfully aware that derivative transactions involve a number of risks, e.g., credit or liquidity risks, that they had previously overlooked or simply ignored. The industry practice for dealing with these issues comes in the form of a series of price adjustments to the classic, risk-neutral price definition of a contingent claim, often coined under mysterious-sounding acronyms such as CVA, DVA, or FVA.<sup>1</sup> The credit valuation adjustment (CVA) corrects the price for the expected costs to the dealer due to the possibility that the counterparty may default, while the so-called debit valuation adjustment (DVA) is a correction for the expected benefits to the dealer due to his own default risk. Dealers also make adjustments due to the costs of funding the trade. This practice is known as a liquidity and funding valuation adjustment (LVA, FVA). Recent headlines, such as J.P. Morgan taking a hit of \$1.5 billion in its 2013 fourth-quarter earnings due to funding valuation adjustments, underscore the sheer importance of accounting for FVA.

In this paper we develop an arbitrage-free valuation approach for collateralized as well as uncollateralized trades that consistently accounts for credit risk, collateral, and funding costs. We derive a general valuation equation where CVA, DVA, collateral, and funding costs are introduced simply as modifications of payout cash flows. This approach can also be tailored to address trading through a central clearing house (CCP) with initial and variation margins, as investigated in Brigo and Pallavicini [6]. In addition, our valuation approach does not put any restrictions on the banks' liquidity policies and hedging strategies, while accommodating asymmetric collateral and funding rates, collateral rehypothecation, and risk-free/replacement close-out conventions. We present an invariance theorem showing that our valuation equations do not depend on any unobservable risk-free rate; valuation is purely based on observable market rates. The invariance theorem first appeared implicitly in Pallavicini et al. [33] and is studied in detail in Brigo et al. [15], a version of which is in this same volume.

<sup>1</sup>Recently, a new adjustment, the so-called KVA or capital valuation adjustment, has been proposed to account for the capital cost of a derivatives transaction (see, e.g., Green et al. [26]). Following the financial crisis, banks face more severe capital requirements and leverage constraints put forth by the Basel Committee and local authorities. Despite being a key issue for the industry, we will not consider costs of capital in this paper.

Several studies have analyzed the various valuation adjustments separately, but few have tried to build a valuation approach that consistently takes collateralization, counterparty credit risk, and funding costs into account. Under unilateral default risk, i.e., when only one party is defaultable, Brigo and Masetti [4] consider valuation of derivatives with CVA, while particular applications of their approach are given in Brigo and Pallavicini [5], Brigo and Chourdakis [3], and Brigo et al. [8]; see Brigo et al. [11] for a summary. Bilateral default risk appears in Bielecki and Rutkowski [1], Brigo and Capponi [2], Brigo et al. [9], and Gregory [27], who price both the CVA and DVA of a derivatives deal. The impact of collateralization on default risk has been investigated in Cherubini [20] and more recently in Brigo et al. [7, 12]. Assuming no default risk, Piterbarg [36] provides an initial analysis of collateralization and funding risk in a stylized Black–Scholes economy. Morini and Prampolini [31], Fries [25], and Castagna [19] consider basic implications of funding in the presence of default risk. However, the most comprehensive attempts to develop a consistent valuation framework are those of Burgard and Kjaer [16, 17], Crépey [21–23], Crépey et al. [24], Pallavicini et al. [33, 34], and Brigo et al. [13, 14].

We follow the works of Pallavicini et al. [34], Brigo et al. [13, 14], and Sloth [37] and consider a general valuation framework that fully and consistently accounts for collateralization, counterparty credit risk, and funding risk when pricing a derivatives trade. We find that the precise patterns of funding-adjusted values depend on a number of factors, including the asymmetry between borrowing and lending rates. Moreover, the introduction of funding risk creates a highly recursive and nonlinear valuation problem. The inherent nonlinearity manifests itself in the valuation equations by taking the form of semi-linear PDEs or BSDEs.

Thus, valuation under funding risk poses a computationally challenging problem; funding and credit costs do not split up in a purely additive way. A consequence of this is that valuation becomes aggregation-dependent. Portfolio values do not simply add up, making it difficult for banks to create CVA and FVA desks with separate and clear-cut responsibilities. Nevertheless, banks often make such simplifying assumptions when accounting for the various price adjustments. This can be done, however, only at the expense of tolerating some degree of double counting in the different valuation adjustments.
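The aggregation dependence described above can be seen already in a toy one-period setting. The sketch below is our own illustration, not the paper's model: positive cash flows are discounted at a hypothetical lending rate and negative ones at a higher borrowing rate, so two offsetting deals valued separately do not sum to the value of the netted portfolio.

```python
import math

def funding_adjusted_value(cashflow, t, f_borrow, f_lend):
    """Toy one-period value under asymmetric funding (illustration only):
    discount a positive cash flow at the lending rate and a negative one
    at the (higher) borrowing rate."""
    rate = f_lend if cashflow >= 0 else f_borrow
    return cashflow * math.exp(-rate * t)

f_borrow, f_lend, t = 0.05, 0.02, 1.0  # hypothetical flat rates

# Two exactly offsetting deals valued on separate desks...
v_separate = (funding_adjusted_value(100.0, t, f_borrow, f_lend)
              + funding_adjusted_value(-100.0, t, f_borrow, f_lend))

# ...versus the netted portfolio, which pays nothing at all.
v_netted = funding_adjusted_value(0.0, t, f_borrow, f_lend)

# v_separate is about 2.90, while v_netted is 0:
# valuation is aggregation-dependent under asymmetric funding.
```

Valuing the two deals separately here double counts the funding spread, which is the kind of error the nonlinearity valuation adjustment quantifies.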

We introduce the concept of nonlinearity valuation adjustment (NVA) to quantify the valuation error that one makes when treating CVA, DVA, and FVA as separate, additive terms. In particular, we examine the financial error of neglecting nonlinearities such as asymmetric borrowing and lending funding rates, and of substituting the replacement close-out at default with the more stylized risk-free close-out assumption. We analyze the large-scale implications of nonlinearity of the valuation equations: non-separability of risks, aggregation dependence in valuation, and local valuation measures as opposed to universal ones. Finally, our numerical results confirm that NVA and asymmetric funding rates can have a non-trivial impact on the valuation of financial derivatives.

To summarize, the financial implications of our valuation framework are fourfold:

1. Valuation is based purely on observable market rates, leaving the value of a derivatives transaction invariant to any theoretical risk-free rate.
2. Funding costs make the valuation problem highly recursive and nonlinear, so credit and funding risks are non-separable and CVA, DVA, and FVA cannot be treated as purely additive adjustments.
3. The bilateral nature of a derivative price breaks down: the value of a trade to the counterparty is not just the opposite of the value seen by the bank.
4. Valuation becomes aggregation-dependent: portfolio values cannot simply be added up.

The above points stress the fact that we are dealing with values rather than prices. By this, we mean to distinguish between the unique *price* of an asset in a complete market with a traded risk-free bank account and the *value* a bank or market participant attributes to the particular asset. Nevertheless, in the following, we will use the terms price and value interchangeably to mean the latter. The paper is organized as follows. Section 2 describes the general valuation framework with collateralized credit, debit, liquidity, and funding valuation adjustments. Section 3 derives an iterative solution of the pricing equation as well as a continuous-time approximation. Section 4 introduces the nonlinearity valuation adjustment and provides numerical results for specific valuation examples. Finally, Sect. 5 concludes the paper.

# **2 Trading Under Collateralization, Close-Out Netting, and Funding Risk**

In this section we develop a general risk-neutral valuation framework for OTC derivative deals. The section clarifies how the traditional pre-crisis derivative price is consistently adjusted to reflect the new market realities of collateralization, counterparty credit risk, and funding risk. We refer to the two parties of a credit-risky deal as the investor or dealer ("I") on one side and the counterparty or client ("C") on the other.

We now introduce the mathematical framework we will use. We point out that the focus here is not on mathematics but on building the valuation framework. Full mathematical subtleties are left for other papers and may motivate slightly different versions of the cash flows, see for example Brigo et al. [15]. More details on the origins of the cash flows used here are in Pallavicini et al. [33, 34].

Fixing the time horizon $T \in \mathbb{R}_+$ of the deal, we define our risk-neutral valuation model on the probability space $(\Omega, \mathcal{G}, (\mathcal{G}_t)_{t \in [0,T]}, \mathbb{Q})$. $\mathbb{Q}$ is the risk-neutral probability measure, ideally associated with the locally risk-free bank account numeraire growing at the risk-free rate $r$. The filtration $(\mathcal{G}_t)_{t \in [0,T]}$ models the flow of information of the whole market, including credit, such that the default times of the investor $\tau_I$ and the counterparty $\tau_C$ are $\mathcal{G}$-stopping times. We adopt the notational convention that $\mathbb{E}_t$ is the risk-neutral expectation conditional on the information $\mathcal{G}_t$. Moreover, we exclude the possibility of simultaneous defaults for simplicity and define the time of the first default event among the two parties as the stopping time

$$
\tau \triangleq \tau_I \wedge \tau_C.
$$
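As a small numerical aside (our illustration, with hypothetical default intensities, not part of the paper), the first-to-default time $\tau$ can be sampled by drawing the two default times and taking their minimum:

```python
import random

def first_default_time(intensity_i, intensity_c, rng):
    """Sample tau = min(tau_I, tau_C), assuming (for illustration only)
    independent exponential default times for investor and counterparty."""
    tau_i = rng.expovariate(intensity_i)  # investor default time
    tau_c = rng.expovariate(intensity_c)  # counterparty default time
    return min(tau_i, tau_c)

rng = random.Random(42)
samples = [first_default_time(0.02, 0.05, rng) for _ in range(100_000)]
mean_tau = sum(samples) / len(samples)
# For independent exponentials, tau is itself exponential with intensity
# 0.02 + 0.05 = 0.07, so the sample mean is close to 1/0.07 (about 14.3 years).
```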

In the sequel we adopt the view of the investor and consider the cash flows and consequences of the deal from her perspective. In other words, when we price the deal we obtain the value of the position to the investor. As we will see, with funding risk this price will not be the value of the deal to the counterparty with opposite sign, in general.

The gist of the valuation framework is conceptually simple and rests neatly on the classical finance disciplines of risk-neutral valuation and discounting cash flows. When a dealer enters into a derivatives deal with a client, a number of cash flows are exchanged, and just like valuation of any other financial claim, discounting these cash in- or outflows gives us a price of the deal. Post-crisis market practice includes four (or more) different types of cash flow streams occurring once a trading position has been entered: (i) Cash flows coming directly from the derivatives contract, such as payoffs, coupons, dividends, etc. We denote by $\pi(t, T)$ the sum of the discounted cash flows happening over the time period $(t, T]$ without including any credit, collateral, and funding effects. This is where classical derivatives valuation would usually stop, and the price of a derivative contract with maturity $T$ would be given by

$$V_t = \mathbb{E}_t \left[ \pi(t, T) \right].$$
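In this classical case the expectation can be estimated by plain Monte Carlo. The sketch below prices a European call, assuming for illustration Black–Scholes dynamics under $\mathbb{Q}$; the model, parameters, and function name are our own toy choices, not from the paper:

```python
import math
import random

def mc_price(s0, strike, r, sigma, maturity, n_paths, rng):
    """Monte Carlo estimate of V_t = E_t[pi(t, T)]: the discounted
    risk-neutral expectation of the payoff, here a European call under
    illustrative Black-Scholes dynamics."""
    total = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        s_T = s0 * math.exp((r - 0.5 * sigma ** 2) * maturity
                            + sigma * math.sqrt(maturity) * z)
        total += max(s_T - strike, 0.0)
    return math.exp(-r * maturity) * total / n_paths

rng = random.Random(0)
price = mc_price(100.0, 100.0, 0.02, 0.2, 1.0, 100_000, rng)
# The Black-Scholes value for these inputs is about 8.9.
```

The rest of this section explains why this classical price must be adjusted once collateral, default, and funding cash flows enter.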

This price assumes no credit risk of the parties involved and no funding risk of the trade. However, present-day market practice requires the price to be adjusted by taking further cash-flow transactions into account: (ii) Cash flows required by collateral margining. If the deal is collateralized, cash flows happen in order to maintain a collateral account that in the case of default will be used to cover any losses. $\gamma(t, T; C)$ is the sum of the discounted margining costs over the period $(t, T]$ with $C$ denoting the collateral account. (iii) Cash flows exchanged once a default event has occurred. We let $\theta_\tau(C, \varepsilon)$ denote the on-default cash flow with $\varepsilon$ being the residual value of the claim traded at default. Lastly, (iv) cash flows required for funding the deal. We denote the sum of the discounted funding costs over the period $(t, T]$ by $\varphi(t, T; F)$ with $F$ being the cash account needed for funding the deal. Collecting the terms, we obtain a consistent price $\bar{V}$ of a derivative deal taking into account counterparty credit risk, margining costs, and funding costs

$$
\bar{V}_t(C, F) = \mathbb{E}_t\left[\pi(t, T \wedge \tau) + \gamma(t, T \wedge \tau; C) + \varphi(t, T \wedge \tau; F) + \mathbf{1}_{\{t < \tau < T\}}\, D(t, \tau)\, \theta_\tau(C, \varepsilon)\right], \tag{1}
$$

where $D(t, \tau) = \exp\left(-\int_t^\tau r_s \, ds\right)$ is the risk-free discount factor.

By using a risk-neutral valuation approach, we see that only the payout needs to be adjusted under counterparty credit and funding risk. In the following paragraphs we expand the terms of (1) and carefully discuss how to compute them.

# *2.1 Collateralization*

The ISDA master agreement is the most commonly used framework for full and flexible documentation of OTC derivative transactions and is published by the International Swaps and Derivatives Association (ISDA [29]). Once agreed between two parties, the master agreement sets out standard terms that apply to all deals entered into between those parties. The ISDA master agreement lists two tools to mitigate counterparty credit risk: *collateralization* and *close-out netting*. Collateralization of a deal means that the party which is out-of-the-money is required to post collateral (usually cash, government securities, or highly rated bonds) corresponding to the amount payable by that party in the case of a default event. The credit support annex (CSA) to the ISDA master agreement defines the rules under which the collateral is posted or transferred between counterparties. Close-out netting means that in the case of default, all transactions with the counterparty under the ISDA master agreement are consolidated into a single net obligation which then forms the basis for any recovery settlements.

Collateralization of a deal usually happens according to a margining procedure. Such a procedure involves both parties posting collateral amounts to, or withdrawing amounts from, the collateral account $C$ according to their current exposure on prefixed dates $\{t_1, \ldots, t_n = T\}$ during the life of the deal, typically daily. Let $\alpha_k$ be the year fraction between $t_k$ and $t_{k+1}$. The terms of the margining procedure may, furthermore, include independent amounts, minimum transfer amounts, thresholds, etc., as described in Brigo et al. [7]. However, here we adopt a general description of the margining procedure that does not rely on the particular terms chosen by the parties.

We consider a collateral account $C$ held by the investor. Moreover, we assume that the investor is the collateral taker when $C_t > 0$ and the collateral provider when $C_t < 0$. The CSA ensures that the collateral taker remunerates the account $C$ at an accrual rate. If the investor is the collateral taker, he remunerates the collateral account by the accrual rate $c_t^+(T)$, while if he is the collateral provider, the counterparty remunerates the account at the rate $c_t^-(T)$.<sup>2</sup> The effective accrual collateral rate $\tilde{c}_t(T)$ is defined as

$$
\tilde{c}_t(T) \triangleq c_t^-(T)\,\mathbf{1}_{\{C_t < 0\}} + c_t^+(T)\,\mathbf{1}_{\{C_t > 0\}}. \tag{2}
$$

<sup>2</sup>We stress the slight abuse of notation here: A plus and minus sign does not indicate that the rates are positive or negative parts of some other rate, but instead it tells which rate is used to accrue interest on the collateral according to the sign of the collateral account.
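The sign convention of (2) is easy to mirror in code. The minimal sketch below (function name ours) returns the rate that accrues on the collateral account:

```python
def effective_accrual_rate(c_minus, c_plus, collateral):
    """Effective collateral accrual rate of Eq. (2): which CSA rate
    accrues on the account depends only on the sign of C_t."""
    if collateral < 0:
        return c_minus  # investor is collateral provider, account accrues c^-
    if collateral > 0:
        return c_plus   # investor is collateral taker, account accrues c^+
    return 0.0          # empty account accrues nothing
```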


More generally, to understand the cash flows originating from collateralization of the deal, let us consider the consequences of the margining procedure to the investor. At the first margin date, say $t_1$, the investor opens the account and posts collateral if he is out-of-the-money, i.e., if $C_{t_1} < 0$, which means that the counterparty is the collateral taker. On each of the following margin dates $t_k$, the investor posts collateral according to his exposure as long as $C_{t_k} < 0$. As collateral taker, the counterparty pays interest on the collateral at the accrual rate $c_{t_k}^-(t_{k+1})$ between the following margin dates $t_k$ and $t_{k+1}$. We assume that interest accrued on the collateral is saved into the account and thereby directly included in the margining procedure and the close-out. Finally, if $C_{t_n} < 0$ on the last margin date $t_n$, the investor closes the collateral account, given no default event has occurred in between. Similarly, for positive values of the collateral account, the investor is instead the collateral taker and the counterparty faces corresponding cash flows at each margin date. If we sum up all the discounted margining cash flows of the investor and the counterparty, we obtain

$$\gamma(t, T \wedge \tau; C) \triangleq \sum_{k=1}^{n-1} \mathbf{1}_{\{t \le t_k < (T \wedge \tau)\}}\, D(t, t_k)\, C_{t_k} \left(1 - \frac{P_{t_k}(t_{k+1})}{P_{t_k}^{\tilde{c}}(t_{k+1})}\right), \tag{3}$$

with the zero-coupon bond $P_t^{\tilde{c}}(T) \triangleq \left[1 + (T - t)\,\tilde{c}_t(T)\right]^{-1}$, and the risk-free zero-coupon bond, related to the risk-free rate $r$, given by $P_t(T)$. If we adopt a first-order expansion (for small $\tilde{c}$ and $r$), we can approximate

$$\gamma\left(t, T \wedge \tau; C\right) \approx \sum\_{k=1}^{n-1} \mathbf{1}\_{\{t \le t\_k < (T \wedge \tau)\}} D(t, t\_k)\, C\_{t\_k}\, \alpha\_k \left(r\_{t\_k}(t\_{k+1}) - \tilde{c}\_{t\_k}(t\_{k+1})\right), \quad (4)$$

where $\alpha\_k$ is the year fraction between $t\_k$ and $t\_{k+1}$ and where, with a slight abuse of notation, we denote by $\tilde{c}\_t(T)$ and $r\_t(T)$ the continuously (as opposed to simply) compounded interest rates associated with the bonds $P^{\tilde{c}}$ and $P$. This last expression clearly shows a cost-of-carry structure for collateral costs. If $C$ is positive to "I", then "I" is holding collateral and will have to pay (hence the minus sign) the interest $c^+$, while receiving the natural growth $r$ for cash, since we are in a risk-neutral world. In the opposite case, if "I" posts collateral, $C$ is negative to "I" and "I" receives the interest $c^-$ while paying the risk-free rate, as should happen when one shorts cash in a risk-neutral world.
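To make the cost-of-carry reading of the first-order approximation concrete, the following sketch sums the discounted carry terms along a single given path of collateral values and rates. All names and numerical inputs are hypothetical placeholders of our own (the paper prescribes none), and the discount curve is assumed flat for simplicity.

```python
import math

def collateral_carry_cost(margin_dates, C, r_fwd, c_tilde, r_flat):
    """First-order collateral cost along one path, as in the approximation (4):
    sum over k of D(0, t_k) * C_{t_k} * alpha_k * (r - c~), where alpha_k is the
    year fraction between consecutive margin dates and discounting uses a flat
    continuously compounded rate r_flat."""
    cost = 0.0
    for k in range(len(margin_dates) - 1):
        alpha_k = margin_dates[k + 1] - margin_dates[k]   # year fraction
        D = math.exp(-r_flat * margin_dates[k])           # D(0, t_k), flat curve
        cost += D * C[k] * alpha_k * (r_fwd[k] - c_tilde[k])
    return cost

# Investor posts collateral (C < 0) remunerated below the risk-free rate:
# the margining procedure is then a net cost to the investor.
dates = [0.25, 0.50, 0.75, 1.00]
cost = collateral_carry_cost(dates, [-100.0, -100.0, -100.0],
                             [0.02] * 3, [0.01] * 3, 0.02)
```

When the collateral rate equals the risk-free rate, the carry terms cancel and the cost is zero, consistent with the cost-of-carry interpretation above.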

A crucial role in collateral procedures is played by rehypothecation. We discuss rehypothecation and its inherent liquidity risk in the following.

#### **Rehypothecation**

Often the CSA grants the collateral taker relatively unrestricted use of the collateral for his liquidity and trading needs until it is returned to the collateral provider. Effectively, the practice of rehypothecation lowers the costs of remuneration of the provided collateral. However, while without rehypothecation the collateral provider can expect to get any excess collateral returned after honoring the amount payable on the deal, if rehypothecation is allowed the collateral provider runs the risk of losing a fraction or all of the excess collateral in case of default on the collateral taker's part.

We denote the recovery fraction on the rehypothecated collateral by $R'\_I$ when the investor is the collateral taker and by $R'\_C$ when the counterparty is the collateral taker. The general recovery fraction on the market value of the deal that the investor receives in the case of default of the counterparty is denoted by $R\_C$, while $R\_I$ is the recovery fraction received by the counterparty if the investor defaults. The collateral provider typically has precedence over other creditors of the defaulting party in getting back any excess capital, which means $R\_I \le R'\_I \le 1$ and $R\_C \le R'\_C \le 1$. If no rehypothecation is allowed and the collateral is kept safe in a segregated account, we have $R'\_I = R'\_C = 1$.

# *2.2 Close-Out Netting*

In case of default, all terminated transactions under the ISDA master agreement with a given counterparty are netted and consolidated into a single claim. This also includes any collateral posted to back the transactions. In this context the close-out amount plays a central role in calculating the on-default cash flows. The close-out amount is the cost or loss that the surviving party incurs when replacing the terminated deal with an economic equivalent. Clearly, the size of this cost depends on which party survives, so we define the close-out amount as

$$\varepsilon\_{\tau} \stackrel{\Delta}{=} \mathbf{1}\_{\{\tau = \tau\_{C} < \tau\_{I}\}}\, \varepsilon\_{I,\tau} + \mathbf{1}\_{\{\tau = \tau\_{I} < \tau\_{C}\}}\, \varepsilon\_{C,\tau},\tag{5}$$

where $\varepsilon\_{I,\tau}$ is the close-out amount on the counterparty's default priced at time $\tau$ by the investor, and $\varepsilon\_{C,\tau}$ is the close-out amount if the investor defaults. Recall that we always consider the deal from the investor's viewpoint in terms of the sign of the cash flows involved. This means that if the close-out amount $\varepsilon\_{I,\tau}$ as measured by the investor is positive, the investor is a creditor of the counterparty, while if it is negative, the investor is a debtor of the counterparty. Analogously, if the close-out amount $\varepsilon\_{C,\tau}$ to the counterparty, but viewed from the investor, is positive, the investor is a creditor of the counterparty, and if it is negative, the investor is a debtor of the counterparty.

We note that the ISDA documentation is, in fact, not very specific about how to actually calculate the close-out amount. Since 2009, ISDA has allowed for the possibility to switch from a risk-free close-out rule to a replacement rule that includes the DVA of the surviving party in the recoverable amount. Parker and McGarry [35] and Weeber and Robson [40] show how a wide range of values of the close-out amount can be produced within the terms of ISDA. We refer to Brigo et al. [7] and the references therein for further discussion of these issues. Here, we adopt the approach of Brigo et al. [7], listing the cash flows of all the various scenarios that can occur if default happens. We will net the exposure against the pre-default value of the collateral $C\_{\tau^-}$ and treat any remaining collateral as an unsecured claim.

If we aggregate all these cash flows and the pre-default value of the collateral account, we reach the following expression for the on-default cash flow


$$\begin{split} \theta\_{\tau}(C, \varepsilon) \triangleq{}& \mathbf{1}\_{\{\tau = \tau\_{C} < \tau\_{I}\}} \left( \varepsilon\_{I,\tau} - \mathrm{LGD}\_{C}\, (\varepsilon\_{I,\tau}^{+} - C\_{\tau^-}^{+})^{+} - \mathrm{LGD}\_{C}'\, (\varepsilon\_{I,\tau}^{-} - C\_{\tau^-}^{-})^{+} \right) \\ &+ \mathbf{1}\_{\{\tau = \tau\_{I} < \tau\_{C}\}} \left( \varepsilon\_{C,\tau} - \mathrm{LGD}\_{I}\, (\varepsilon\_{C,\tau}^{-} - C\_{\tau^-}^{-})^{-} - \mathrm{LGD}\_{I}'\, (\varepsilon\_{C,\tau}^{+} - C\_{\tau^-}^{+})^{-} \right). \end{split} \tag{6}$$

We use the shorthand notation $X^{+} := \max(X, 0)$ and $X^{-} := \min(X, 0)$, and define the loss-given-default as $\mathrm{LGD}\_{C} \stackrel{\Delta}{=} 1 - R\_{C}$ and the collateral loss-given-default as $\mathrm{LGD}\_{C}' \stackrel{\Delta}{=} 1 - R\_{C}'$. If both parties agree on the exposure, namely $\varepsilon\_{I,\tau} = \varepsilon\_{C,\tau} = \varepsilon\_{\tau}$, then, taking the risk-neutral expectation in (1), we see that the price of the discounted on-default cash flow,

$$\begin{aligned} \mathbb{E}\_t[\mathbf{1}\_{\{t < \tau < T\}} D(t, \tau) \theta\_\tau(C, \varepsilon)] &= \mathbb{E}\_t[\mathbf{1}\_{\{t < \tau < T\}} D(t, \tau) \, \varepsilon\_\tau] \\ &- \mathrm{CVA}(t, T; C) + \mathrm{DVA}(t, T; C), \qquad (7) \end{aligned}$$

is the present value of the close-out amount reduced by the positive collateralized CVA and DVA terms

$$\begin{aligned} \Pi\_{\mathrm{CVAcoll}}(s) &= \left( \mathrm{LGD}\_{C}\,(\varepsilon\_{I,s}^{+} - C\_{s^-}^{+})^{+} + \mathrm{LGD}\_{C}'\,(\varepsilon\_{I,s}^{-} - C\_{s^-}^{-})^{+} \right) \geq 0, \\ \Pi\_{\mathrm{DVAcoll}}(s) &= -\left( \mathrm{LGD}\_{I}\,(\varepsilon\_{C,s}^{-} - C\_{s^-}^{-})^{-} + \mathrm{LGD}\_{I}'\,(\varepsilon\_{C,s}^{+} - C\_{s^-}^{+})^{-} \right) \geq 0, \end{aligned}$$

and

$$\text{CVA}(t, T; C) \stackrel{\Delta}{=} \mathbb{E}\_{t} \left[ \mathbf{1}\_{\{\tau = \tau\_{C} < T\}} D(t, \tau)\, \Pi\_{\mathrm{CVAcoll}}(\tau) \right],$$

$$\text{DVA}(t, T; C) \stackrel{\Delta}{=} \mathbb{E}\_{t} \left[ \mathbf{1}\_{\{\tau = \tau\_{I} < T\}} D(t, \tau)\, \Pi\_{\mathrm{DVAcoll}}(\tau) \right]. \tag{8}$$

Also, observe that if rehypothecation of the collateral is not allowed, the terms multiplied by $\mathrm{LGD}\_{C}'$ and $\mathrm{LGD}\_{I}'$ drop out of the CVA and DVA calculations.
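To make the bookkeeping in the on-default cash flow (6) concrete, here is a direct transcription in code, netting the close-out amounts against the pre-default collateral. The function and argument names are illustrative choices of our own; the `*_coll` loss-given-default arguments play the role of the primed quantities in the text.

```python
def pos(x):
    """X^+ = max(X, 0)."""
    return max(x, 0.0)

def neg(x):
    """X^- = min(X, 0)."""
    return min(x, 0.0)

def on_default_cash_flow(eps_I, eps_C, C_pre, first_default,
                         LGD_C, LGD_C_coll, LGD_I, LGD_I_coll):
    """On-default cash flow theta_tau(C, eps) of Eq. (6).

    first_default: 'C' if the counterparty defaults first, 'I' if the investor does.
    Unprimed LGDs apply to the uncollateralized part of the exposure, the
    *_coll LGDs (primed in the text) to rehypothecated excess collateral."""
    if first_default == 'C':
        return (eps_I
                - LGD_C * pos(pos(eps_I) - pos(C_pre))
                - LGD_C_coll * pos(neg(eps_I) - neg(C_pre)))
    else:
        return (eps_C
                - LGD_I * neg(neg(eps_C) - neg(C_pre))
                - LGD_I_coll * neg(pos(eps_C) - pos(C_pre)))
```

With full recovery (all LGDs zero) the surviving party simply receives the close-out amount, and with zero pre-default collateral the formula reduces to the familiar uncollateralized CVA/DVA payoffs.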

# *2.3 Funding Risk*

The hedging strategy that perfectly replicates the no-arbitrage price of a derivative is formed by a position in cash and a position in a portfolio of hedging instruments. When we talk about a derivative deal's funding, we essentially mean the cash position that is required as part of the hedging strategy, and with funding costs we refer to the costs of maintaining this cash position. If we denote the cash account by *F* and the risky asset account by *H*, we get

$$V\_t = F\_t + H\_t.$$

In the classical Black–Scholes–Merton theory, the risky part $H$ of the hedge would be a delta position in the underlying stock, whereas the locally risk-free (cash) part $F$ would be a position in the risk-free bank account. If the deal is collateralized, the margining procedure is included in the deal definition, ensuring that funding of the collateral is automatically taken into account. Moreover, if rehypothecation is allowed for the collateralized deal, the collateral taker can use the posted collateral as a funding source and thereby reduce or maybe even eliminate the costs of funding the deal. Thus, we have the following two definitions of the funding account: if rehypothecation of the posted collateral is allowed,

$$F\_t \stackrel{\Delta}{=} V\_t - C\_t - H\_t,\tag{9}$$

and if such rehypothecation is forbidden, we have

$$F\_t \stackrel{\Delta}{=} V\_t - H\_t. \tag{10}$$

It follows from (9) and (10) that if the funding account $F\_t > 0$, the dealer needs to borrow cash to establish the hedging strategy at time $t$. Correspondingly, if $F\_t < 0$, the hedging strategy requires the dealer to invest surplus cash. Specifically, we assume the dealer enters a funding position on a discrete time-grid $\{t\_1, \ldots, t\_m\}$ during the life of the deal. Given two adjacent funding times $t\_j$ and $t\_{j+1}$, for $1 \le j \le m-1$, the dealer enters a position in cash equal to $F\_{t\_j}$ at time $t\_j$. At time $t\_{j+1}$ the dealer redeems the position again and either returns the cash to the funder if it was a long cash position, paying funding costs on the borrowed cash, or gets the cash back if it was a short cash position, receiving funding benefits as interest on the invested cash. We assume that these funding costs and benefits are determined at the start date of each funding period and charged at the end of the period.

Let $P^{\tilde{f}}\_t(T)$ represent the price, measurable at $t$, of a borrowing (or lending) contract where the dealer pays (or receives) one unit of cash at maturity $T > t$. We introduce the effective funding rate $\tilde{f}\_t$ as a function $\tilde{f}\_t = f(t, F, H, C)$, assuming that it depends on the funding account $F\_t$, the hedging account $H\_t$, and the collateral account $C\_t$. Moreover, the zero-coupon bond corresponding to the effective funding rate is defined as

$$P\_t^{\tilde{f}}(T) \stackrel{\Delta}{=} [1 + (T - t)\tilde{f}\_t(T)]^{-1}.$$

If we assume that the dealer hedges the derivatives position by trading in the spot market of the underlying asset(s), and the hedging strategy is implemented on the same time-grid as the funding procedure of the deal, the sum of discounted cash flows from funding the hedging strategy during the life of the deal is equal to

$$\begin{split} \varphi(t, T \wedge \tau; F, H) &= \sum\_{j=1}^{m-1} \mathbf{1}\_{\{t \le t\_j < (T \wedge \tau)\}} D(t, t\_j) \left( F\_{t\_j} - (F\_{t\_j} + H\_{t\_j}) \frac{P\_{t\_j}(t\_{j+1})}{P\_{t\_j}^{\tilde{f}}(t\_{j+1})} + H\_{t\_j} \frac{P\_{t\_j}(t\_{j+1})}{P\_{t\_j}^{\tilde{f}}(t\_{j+1})} \right) \\ &= \sum\_{j=1}^{m-1} \mathbf{1}\_{\{t \le t\_j < (T \wedge \tau)\}} D(t, t\_j)\, F\_{t\_j} \left( 1 - \frac{P\_{t\_j}(t\_{j+1})}{P\_{t\_j}^{\tilde{f}}(t\_{j+1})} \right). \end{split} \tag{11}$$

This is, strictly speaking, a discounted payout and the funding cost or benefit at time *t* is obtained by taking the risk-neutral expectation of the above cash flows. For a trading example giving more details on how the above formula for ϕ originates, see Brigo et al. [15].

As we can see from Eq. (11), the dependence on the hedging account drops out of the funding procedure. For modeling convenience, we can define the effective funding rate $\tilde{f}\_t$ faced by the dealer as

$$\tilde{f}\_t(T) \stackrel{\Delta}{=} f\_t^-(T)\mathbf{1}\_{\{F\_t < 0\}} + f\_t^+(T)\mathbf{1}\_{\{F\_t > 0\}}.\tag{12}$$

A related framework would be to consider the hedging account *H* as being perfectly collateralized and use the collateral to fund hedging, so that there is no funding cost associated with the hedging account.

As with the collateral costs discussed earlier, we may rewrite the funding cash flows as a first-order approximation in the continuously compounded rates $\tilde{f}$ and $r$ associated with the relevant bonds, with $\alpha\_j$ denoting the year fraction between $t\_j$ and $t\_{j+1}$. We obtain

$$\varphi(t, T \wedge \tau; F) \approx \sum\_{j=1}^{m-1} \mathbf{1}\_{\{t \le t\_j < (T \wedge \tau)\}} D(t, t\_j)\, F\_{t\_j}\, \alpha\_j \left( r\_{t\_j}(t\_{j+1}) - \tilde{f}\_{t\_j}(t\_{j+1}) \right). \tag{13}$$
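The sign-switching rule (12) and the carry approximation (13) can be sketched together in a few lines. As before, every input is a hypothetical placeholder of our own and the discount curve is assumed flat.

```python
import math

def effective_funding_rate(F, f_minus, f_plus):
    """Eq. (12): borrow at f+ when the funding account F is positive,
    invest at f- when it is negative."""
    return f_plus if F > 0 else f_minus

def funding_carry_cost(fund_dates, F, r_fwd, f_minus, f_plus, r_flat):
    """First-order funding cost/benefit along one path, as in (13):
    sum over j of D(0, t_j) * F_{t_j} * alpha_j * (r - f~),
    with a flat continuously compounded rate r_flat for discounting."""
    cost = 0.0
    for j in range(len(fund_dates) - 1):
        alpha_j = fund_dates[j + 1] - fund_dates[j]       # year fraction
        f_eff = effective_funding_rate(F[j], f_minus[j], f_plus[j])
        D = math.exp(-r_flat * fund_dates[j])             # D(0, t_j)
        cost += D * F[j] * alpha_j * (r_fwd[j] - f_eff)
    return cost
```

When the dealer must borrow ($F > 0$) at a spread over the risk-free rate ($f^+ > r$) the sum is negative, i.e. a funding cost; with symmetric rates $f^+ = f^- = r$ the adjustment vanishes.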

We should also mention that, occasionally, we may include the effects of repo markets or stock lending in our framework. In general, we may borrow/lend the cash needed to establish $H$ from/to our treasury, and we may then use the risky asset in $H$ for repo or stock lending/borrowing in the market. This means that we could include the funding costs and benefits coming from this use of the risky asset. Here, we assume that the bank's treasury automatically recognizes this benefit or cost at the same rate $\tilde{f}$ as used for cash, but for a more general analysis involving a repo rate $\tilde{h}$ we refer to, for example, Pallavicini et al. [34] and Brigo et al. [15].

The particular positions entered by the dealer to either borrow or invest cash, according to the sign and size of the funding account, depend on the bank's liquidity policy. In the following we discuss two possible cases: one where the dealer can fund at rates set by the bank's treasury department, and another where the dealer goes to the market directly and funds his trades at the prevailing market rates. As a result, the funding rates, and therefore the funding effect on the price of a derivative deal, depend intimately on the chosen liquidity policy.

#### **Treasury Funding**

If the dealer funds the hedge through the bank's treasury department, the treasury determines the funding rates $f^{\pm}$ faced by the dealer, often assuming average funding costs and benefits across all deals. This leads to two curves as functions of maturity: one for borrowing funds, $f^{+}$, and one for lending funds, $f^{-}$. After entering a funding position $F\_{t\_j}$ at time $t\_j$, the dealer faces the following discounted cash flow

$$\Phi\_j(t\_j, t\_{j+1}; F) \stackrel{\Delta}{=} -N\_{t\_j} D(t\_j, t\_{j+1}),$$

with

$$N\_{t\_j} \overset{\Delta}{=} \frac{F\_{t\_j}^-}{P\_{t\_j}^{f^-}(t\_{j+1})} + \frac{F\_{t\_j}^+}{P\_{t\_j}^{f^+}(t\_{j+1})}.$$

Under this liquidity policy, the treasury—and not the dealer himself—is in charge of debt valuation adjustments due to funding-related positions. Also, being entities of the same institution, both the dealer and the treasury disappear in case of default of the institution without any further cash flows being exchanged, so we can neglect the effects of funding in this case. When default risk is considered, this leads to the following definition of the funding cash flows

$$\bar{\Phi}\_j(t\_j, t\_{j+1}; F) \stackrel{\Delta}{=} \mathbf{1}\_{\{\tau > t\_j\}} \Phi\_j(t\_j, t\_{j+1}; F).$$

Thus, the risk-neutral price of the cash flows due to the funding positions entered at time *tj* is

$$\mathbb{E}\_{t\_j}[\bar{\Phi}\_j(t\_j, t\_{j+1}; F)] = -\mathbf{1}\_{\{\tau > t\_j\}} \left( F\_{t\_j}^- \frac{P\_{t\_j}(t\_{j+1})}{P\_{t\_j}^{f^-}(t\_{j+1})} + F\_{t\_j}^+ \frac{P\_{t\_j}(t\_{j+1})}{P\_{t\_j}^{f^+}(t\_{j+1})} \right).$$

If we consider a sequence of such funding operations at each time *tj* during the life of the deal, we can define the sum of cash flows coming from all the borrowing and lending positions opened by the dealer to hedge the trade up to the first-default event

$$\begin{split} \varphi(t, T \wedge \tau; F) & \stackrel{\Delta}{=} \sum\_{j=1}^{m-1} \mathbf{1}\_{\{t \le t\_{j} < (T \wedge \tau)\}} D(t, t\_{j}) \left( F\_{t\_{j}} + \mathbb{E}\_{t\_{j}} \Big[ \bar{\Phi}\_{j}(t\_{j}, t\_{j+1}; F) \Big] \right) \\ &= \sum\_{j=1}^{m-1} \mathbf{1}\_{\{t \le t\_{j} < (T \wedge \tau)\}} D(t, t\_{j}) \left( F\_{t\_{j}} - F\_{t\_{j}}^{-} \frac{P\_{t\_{j}}(t\_{j+1})}{P\_{t\_{j}}^{f^{-}}(t\_{j+1})} - F\_{t\_{j}}^{+} \frac{P\_{t\_{j}}(t\_{j+1})}{P\_{t\_{j}}^{f^{+}}(t\_{j+1})} \right). \end{split} \tag{14}$$

In terms of the effective funding rate, this expression collapses to (11).

#### **Market Funding**

If the dealer funds the hedging strategy in the market—and not through the bank's treasury—the funding rates are determined by prevailing market conditions and are often deal-specific. This means that the rate $f^{+}$ at which the dealer can borrow funds may be different from the rate $f^{-}$ at which funds can be invested. Moreover, these rates may differ across deals depending on the deals' notionals, maturity structures, dealer-client relationships, and so forth. As under the treasury funding policy, we assume a deal's funding operations are closed down in the case of default. Furthermore, as the dealer now operates directly in the market, he needs to include a DVA due to his funding positions when he marks his trading books to market. For simplicity, we assume that the funder in the market is default-free, so no funding CVA needs to be accounted for. The discounted cash flow from the borrowing or lending position between two adjacent funding times $t\_j$ and $t\_{j+1}$ is given by

$$\begin{aligned} \bar{\Phi}\_j(t\_j, t\_{j+1}; F) \stackrel{\Delta}{=}{}& \mathbf{1}\_{\{\tau > t\_j\}} \mathbf{1}\_{\{\tau\_I > t\_{j+1}\}} \Phi\_j(t\_j, t\_{j+1}; F) \\ & - \mathbf{1}\_{\{\tau > t\_j\}} \mathbf{1}\_{\{\tau\_I < t\_{j+1}\}} \left( \mathrm{LGD}\_I\, \varepsilon\_{F,\tau\_I}^{-} - \varepsilon\_{F,\tau\_I} \right) D(t\_j, \tau\_I), \end{aligned}$$

where $\varepsilon\_{F,\tau\_I}$ is the close-out amount calculated by the funder on the dealer's default

$$\varepsilon\_{F, \tau\_I} \stackrel{\Delta}{=} -N\_{t\_j} P\_{\tau\_I}(t\_{j+1}).$$

To price this funding cash-flow, we take the risk-neutral expectation

$$\mathbb{E}\_{t\_j} \Big[ \bar{\Phi}\_j(t\_j, t\_{j+1}; F) \Big] = -\mathbf{1}\_{\{\tau > t\_j\}} \left( F\_{t\_j}^- \frac{P\_{t\_j}(t\_{j+1})}{P\_{t\_j}^{f^-}(t\_{j+1})} + F\_{t\_j}^+ \frac{P\_{t\_j}(t\_{j+1})}{\bar{P}\_{t\_j}^{f^+}(t\_{j+1})} \right).$$

Here, the zero-coupon funding bond $\bar{P}\_t^{f^+}(T)$ for borrowing cash is adjusted for the dealer's credit risk

$$\bar{P}\_t^{f^+}(T) \stackrel{\Delta}{=} \frac{P\_t^{f^+}(T)}{\mathbb{E}\_t^T \left[ \mathrm{LGD}\_I\, \mathbf{1}\_{\{\tau\_I > T\}} + R\_I \right]},$$

where the expectation on the right-hand side is taken under the *T* -forward measure. Naturally, since the seniority could be different, one might assume a different recovery rate on the funding position than on the derivatives deal itself (see Crépey [21]). Extensions to this case are straightforward.

Next, summing the discounted cash flows from the sequence of funding operations through the life of the deal, we get a new expression for $\varphi$ that is identical to (14) with $P\_{t\_j}^{f^+}(t\_{j+1})$ in the denominator replaced by $\bar{P}\_{t\_j}^{f^+}(t\_{j+1})$. To avoid cumbersome notation, we will not explicitly write $\bar{P}^{f^+}$ in the sequel, but just keep in mind that when the dealer funds directly in the market, $P^{f^+}$ needs to be adjusted for *funding DVA*. Thus, in terms of the effective funding rate, we again obtain (11).
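To get a feel for the size of this funding-DVA adjustment, suppose for illustration that the dealer's survival probability to $T$ under the $T$-forward measure is a known number $q$; the denominator above then reduces to $\mathrm{LGD}\_I\, q + R\_I$. This deterministic simplification is ours, purely for illustration.

```python
def credit_adjusted_funding_bond(P_f_plus, q_survival, R_I):
    """Dealer-credit adjustment of the borrowing bond:
    P_bar = P^{f+} / (LGD_I * q + R_I), with LGD_I = 1 - R_I and q the
    dealer's (here deterministic) survival probability to maturity."""
    LGD_I = 1.0 - R_I
    return P_f_plus / (LGD_I * q_survival + R_I)
```

If the dealer cannot default ($q = 1$) the denominator is one and the bond is unchanged; any default risk makes $\bar{P}^{f^+} > P^{f^+}$, i.e. a lower effective borrowing rate, reflecting the windfall benefit at the dealer's own default.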

# **3 Generalized Derivatives Valuation**

In the previous section we analyzed the discounted cash flows of a derivatives trade and we developed a framework for consistent valuation of such deals under collateralized counterparty credit and funding risk. The arbitrage-free valuation framework is captured in the following theorem.

**Theorem 1** (Generalized Valuation Equation) *The consistent arbitrage-free price* $\bar{V}\_t(C, F)$ *of a contingent claim under counterparty credit risk and funding costs takes the form*

$$\bar{V}\_t(C, F) = \mathbb{E}\_t \left[ \pi(t, T \wedge \tau) + \gamma(t, T \wedge \tau; C) + \varphi(t, T \wedge \tau; F) + \mathbf{1}\_{\{t < \tau < T\}} D(t, \tau)\, \theta\_\tau(C, \varepsilon) \right], \tag{15}$$

*where the cash-flow terms* $\pi$, $\gamma$, $\varphi$, *and* $\theta\_\tau$ *are as defined in the previous section.*
Note that in general a nonlinear funding rate may lead to arbitrage, since the choice of the martingale measure depends on the funding/hedging strategy (see Remark 4.2). One has to be careful in order to guarantee that the relevant valuation equation admits solutions. Existence and uniqueness of solutions in the framework of this paper are discussed from a fully mathematical point of view in Brigo et al. [15], a version of which, from the same authors, appears in this volume.

In general, while the valuation equation is conceptually clear—we simply take the expectation of the sum of all discounted cash flows of the trade under the risk-neutral measure—solving the equation poses a recursive, nonlinear problem. The future paths of the effective funding rate $\tilde{f}$ depend on the future signs of the funding account $F$, i.e. whether we need to borrow or lend cash on each future funding date. At the same time, through the relations (9) and (10), the future sign and size of the funding account $F$ depend on the adjusted price $\bar{V}$ of the deal, which is the quantity we are trying to compute in the first place. One crucial implication of this nonlinear structure of the valuation problem is the fact that the FVA is generally not just an additive adjustment term, as often assumed. More importantly, we see that the celebrated conjecture identifying the DVA of a deal with its funding benefit is not fully general. Only in the unrealistic setting where the dealer can fund an uncollateralized trade at equal borrowing and lending rates, i.e. $f^{+} = f^{-}$, do we achieve the additive structure often assumed by practitioners. If the trade is collateralized, we need to impose even further restrictions on how the collateral is linked to the price of the trade $\bar{V}$. It should be noted here that funding DVA (as referred to in the previous section) is similar to DVA2 in Hull and White [28] and to the concept of "windfall funding benefit at own default" in Crépey [22, 23]. In practice, however, funds transfer pricing and similar operations conducted by banks' treasuries clearly weaken the link between FVA and this source of DVA. The DVA of the funding instruments does not regard the bank's funding positions, but the derivatives position, and in general it does not match the FVA, mainly due to the presence of funding netting sets.
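The failure of additivity is easy to see in a stylized one-period example with deterministic payoffs, no hedge, no collateral, and no default risk, so that the funding account equals the deal value itself. The setup below is our own toy construction, purely to exhibit the nonlinearity.

```python
def one_period_value(payoff, f_minus, f_plus, dt=1.0):
    """One-period value with asymmetric funding: the sign of the value
    (= funding account here) selects the rate, f+ to borrow, f- to invest."""
    rate = f_plus if payoff > 0 else f_minus
    return payoff / (1.0 + rate * dt)

# Valuing two deals separately and netting the results does not match
# valuing the netted package when f+ != f-:
v_long = one_period_value(+100.0, f_minus=0.01, f_plus=0.05)
v_short = one_period_value(-60.0, f_minus=0.01, f_plus=0.05)
v_netted = one_period_value(+40.0, f_minus=0.01, f_plus=0.05)
```

Here `v_long + v_short` differs from `v_netted`, whereas with $f^{+} = f^{-}$ the three values are consistent and the funding adjustment becomes additive, exactly as discussed above.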

#### *Remark 1* (*The Law of One Price*)

On the theoretical side, the generalized valuation equation shakes the foundation of the celebrated Law of One Price prevailing in classical derivatives pricing. Clearly, if we assume no funding costs, the dealer and counterparty agree on the price of the deal, as both parties can—at least theoretically—observe the credit risk of each other through CDS contracts traded in the market and the relevant market risks, thus agreeing on CVA and DVA. In contrast, once funding costs are introduced, they will not agree on the FVA of the deal, due to asymmetric information. The parties cannot observe each other's liquidity policies nor their respective funding costs associated with a particular deal. As a result, the value of a deal position will not generally be the same to the counterparty as to the dealer, just with opposite sign.

Finally, as we adopt a risk-neutral valuation framework, we implicitly assume the existence of a risk-free interest rate. Indeed, since the valuation adjustments are included as additional cash flows and not as ad hoc spreads, all the cash flows in (15) are discounted by the risk-free discount factor $D(t, T)$. Nevertheless, the risk-free rate is merely an instrumental variable of the general valuation equation. We clearly distinguish market rates from the theoretical risk-free rate, avoiding the dubious claim that overnight rates are risk-free. In fact, as we will show in continuous time, if the dealer funds the hedging strategy of the trade through the cash accounts available to him—whether rehypothecated collateral or funds from the treasury, repo market, etc.—the risk-free rate vanishes from the valuation equation.

# *3.1 Discrete-Time Solution*

Our purpose here is to turn the generalized valuation equation (15) into a set of iterative equations that can be solved by least-squares Monte Carlo methods. These methods are already standard in CVA and DVA calculations (Brigo and Pallavicini [5]). To this end, we introduce the auxiliary function

$$\begin{aligned} \bar{\pi}\left(t\_j, t\_{j+1}; C\right) & \stackrel{\Delta}{=} \pi\left(t\_j, t\_{j+1} \wedge \tau\right) + \gamma\left(t\_j, t\_{j+1} \wedge \tau; C\right) \\ & + \mathbf{1}\_{\{t\_j < \tau < t\_{j+1}\}} D(t\_j, \tau) \theta\_\tau(C, \varepsilon) \end{aligned} \tag{16}$$

which defines the cash flows of the deal occurring between time $t\_j$ and $t\_{j+1}$, adjusted for collateral margining costs and default risks. We stress the fact that the close-out amount used for calculating the on-default cash flow still refers to a deal with maturity $T$. If we then solve the valuation equation (15) at each funding date $t\_j$ in the time-grid $\{t\_1, \ldots, t\_n = T\}$, we obtain the deal price $\bar{V}$ at time $t\_j$ as a function of the deal price on the next consecutive funding date $t\_{j+1}$

$$\begin{split} \bar{V}\_{t\_{j}} &= \mathbb{E}\_{t\_{j}} \Big[ \bar{V}\_{t\_{j+1}} D(t\_{j}, t\_{j+1}) + \bar{\pi} \left( t\_{j}, t\_{j+1}; C \right) \Big] \\ &+ \mathbf{1}\_{\{\tau > t\_{j}\}} \Big( F\_{t\_{j}} - F\_{t\_{j}}^{-} \frac{P\_{t\_{j}}(t\_{j+1})}{P\_{t\_{j}}^{f^{-}}(t\_{j+1})} - F\_{t\_{j}}^{+} \frac{P\_{t\_{j}}(t\_{j+1})}{P\_{t\_{j}}^{f^{+}}(t\_{j+1})} \Big), \end{split}$$

where, by definition, $\bar{V}\_{t\_n} \stackrel{\Delta}{=} 0$ on the final date $t\_n$. Recalling the definitions of the funding account in (9) if rehypothecation of collateral is allowed and in (10) if it is forbidden, we can then solve the above for the positive and negative parts of the funding account. The outcome of this exercise is a discrete-time iterative solution of the recursive valuation equation, provided in the following theorem.

**Theorem 2** (Discrete-time Solution of the Generalized Valuation Equation) *We may solve the full recursive valuation equation in Theorem 1 as a set of backward-iterative equations on the time-grid* $\{t\_1, \ldots, t\_n = T\}$ *with* $\bar{V}\_{t\_n} \stackrel{\Delta}{=} 0$*. For* $\tau < t\_j$*, we have*

$$\bar{V}\_{t\_j} = 0,$$

*while for* τ > *tj , we have*

*(i) if rehypothecation is forbidden:*

$$\left(\bar{V}\_{t\_j} - H\_{t\_j}\right)^{\pm} = P\_{t\_j}^{\tilde{f}}(t\_{j+1}) \left(\mathbb{E}\_{t\_j}^{t\_{j+1}} \left[\bar{V}\_{t\_{j+1}} + \frac{\bar{\pi}\left(t\_j, t\_{j+1}; C\right) - H\_{t\_j}}{D(t\_j, t\_{j+1})}\right]\right)^{\pm},$$

*(ii) if rehypothecation is allowed:*

$$\begin{aligned} &\left(\bar{V}\_{t\_j} - C\_{t\_j} - H\_{t\_j}\right)^{\pm} \\ &\quad = P\_{t\_j}^{\tilde{f}}(t\_{j+1}) \left( \mathbb{E}\_{t\_j}^{t\_{j+1}} \left[ \bar{V}\_{t\_{j+1}} + \frac{\bar{\pi} \left( t\_j, t\_{j+1}; C \right) - C\_{t\_j} - H\_{t\_j}}{D(t\_j, t\_{j+1})} \right] \right)^{\pm}, \end{aligned}$$

*where the expectations are taken under the* $t\_{j+1}$*-forward measure* $\mathbb{Q}^{t\_{j+1}}$.

The $\pm$ sign in the theorem stresses the fact that the sign of the funding account, which determines the effective funding rate, depends on the sign of the conditional expectation. Further intuition may be gained by going to continuous time, which is the case we now turn to.
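To build intuition for Theorem 2, consider a toy version in which there is no default risk, no collateral, no hedge, and all cash flows are deterministic, so the conditional expectations collapse to known numbers and the backward iteration reduces to period-by-period discounting at $f^{+}$ or $f^{-}$ according to the sign of the bracket. In a realistic setting the expectations would instead be estimated by least-squares Monte Carlo regression; this sketch, a simplification of our own, only shows the sign-splitting mechanics.

```python
def backward_value(coupons, year_fractions, f_minus, f_plus):
    """Toy backward iteration in the spirit of Theorem 2 (i), with H = C = 0
    and no default risk: V_{t_j} = P^f(t_j, t_{j+1}) * (V_{t_{j+1}} + coupon),
    where the simply-compounded funding bond uses f+ or f- depending on the
    sign of the bracket. coupons[j] is paid at t_{j+1}; V at the final date is 0."""
    V = 0.0
    for dt, coupon in zip(reversed(year_fractions), reversed(coupons)):
        bracket = V + coupon              # the conditional expectation, here known
        f = f_plus if bracket > 0 else f_minus
        V = bracket / (1.0 + dt * f)      # funding zero-coupon bond P^f
    return V
```

A single positive coupon is discounted at the borrowing rate $f^{+}$, a negative one at the lending rate $f^{-}$; mixed-sign coupon streams switch rate period by period, which is exactly the recursion that makes the valuation nonlinear.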

# *3.2 Continuous-Time Solution*

Let us consider a continuous-time approximation of the general valuation equation. This implies that collateral margining, funding, and hedging strategies are executed in continuous time. Moreover, we assume that rehypothecation is allowed, but similar results hold if this is not the case. By taking the time limit, we have the following expressions for the discounted cash flow streams of the deal

$$\begin{aligned} \pi(t, T \wedge \tau) &= \int\_{t}^{T \wedge \tau} \pi(s, s + ds) D(t, s), \\ \gamma(t, T \wedge \tau; C) &= \int\_{t}^{T \wedge \tau} (r\_s - \tilde{c}\_s) C\_s D(t, s) ds, \end{aligned}$$


$$\varphi(t, T \wedge \tau; F) = \int\_{t}^{T \wedge \tau} (r\_s - \tilde{f}\_s) F\_s D(t, s) ds,$$

where, as mentioned earlier, $\pi(t, t + dt)$ is the pay-off coupon process of the derivative contract and $r\_t$ is the risk-free rate. These equations can also be derived immediately from the approximations given in Eqs. (4) and (13).

Then, putting all the above terms together with the on-default cash flow as in Theorem 1, the recursive valuation equation yields

$$\begin{split} \bar{V}\_{t} &= \int\_{t}^{T} \mathbb{E}\_{t} \Big[ \left( \mathbf{1}\_{\{s < \tau\}} \pi(s, s + ds) + \mathbf{1}\_{\{\tau \in ds\}}\, \theta\_{s}(C, \varepsilon) \right) D(t, s) \Big] \\ &+ \int\_{t}^{T} \mathbb{E}\_{t} \Big[ \mathbf{1}\_{\{s < \tau\}} (r\_{s} - \tilde{c}\_{s}) C\_{s} D(t, s) \Big] ds \\ &+ \int\_{t}^{T} \mathbb{E}\_{t} \Big[ \mathbf{1}\_{\{s < \tau\}} (r\_{s} - \tilde{f}\_{s}) F\_{s} D(t, s) \Big] ds. \end{split} \tag{17}$$

By recalling Eq. (7), we can write the following

**Proposition 1** *The value* $\bar{V}\_t$ *of the claim under credit gap risk, collateral, and funding costs can be written as*

$$\bar{V}\_t = V\_t - \mathrm{CVA}\_t + \mathrm{DVA}\_t + \mathrm{LVA}\_t + \mathrm{FVA}\_t \tag{18}$$

*where* $V\_t$ *is the price of the deal when there is no credit risk, no collateral, and no funding costs; LVA is a liquidity valuation adjustment accounting for the costs/benefits of collateral margining; FVA is the funding cost/benefit of the deal hedging strategy; and CVA and DVA are the familiar credit and debit valuation adjustments after collateralization. These adjustments can be obtained by rewriting (17). One gets*

$$V\_t = \int\_t^T \mathbb{E}\_t \left\{ D(t, s) \left[ \mathbf{1}\_{\{\tau > s\}}\, \pi\left(s, s + ds\right) + \mathbf{1}\_{\{\tau \in ds\}}\, \varepsilon\_s \right] \right\} \tag{19}$$

*and the valuation adjustments*

$$\begin{split} \mathrm{CVA}\_{t} &= \int\_{t}^{T} \mathbb{E}\_{t}\left[ D(t,s)\, \mathbf{1}\_{\{\tau \in ds\}}\, \mathbf{1}\_{\{s = \tau\_{C} < \tau\_{I}\}}\, \Pi\_{\mathrm{CVAcoll}}(s) \right] \\ \mathrm{DVA}\_{t} &= \int\_{t}^{T} \mathbb{E}\_{t}\left[ D(t,s)\, \mathbf{1}\_{\{\tau \in ds\}}\, \mathbf{1}\_{\{s = \tau\_{I} < \tau\_{C}\}}\, \Pi\_{\mathrm{DVAcoll}}(s) \right] \\ \mathrm{LVA}\_{t} &= \int\_{t}^{T} \mathbb{E}\_{t}\Big[ D(t,s)\, \mathbf{1}\_{\{\tau > s\}} (r\_{s} - \tilde{c}\_{s}) C\_{s} \Big]\, ds \\ \mathrm{FVA}\_{t} &= \int\_{t}^{T} \mathbb{E}\_{t}\Big[ D(t,s)\, \mathbf{1}\_{\{\tau > s\}} (r\_{s} - \tilde{f}\_{s}) F\_{s} \Big]\, ds \end{split}$$

*As usual, CVA and DVA are both positive, while LVA and FVA can be either positive or negative. Notice that if* $\tilde{c}$ *equals the risk-free rate, the LVA vanishes. Similarly, the FVA vanishes if the funding rate* $\tilde{f}$ *is equal to the risk-free rate.*

We note that there is no general consensus on our definition of LVA and other authors may define it differently. For instance, Crépey [21–23] refers to LVA as the liquidity component (i.e., net of credit) of the funding valuation adjustment.
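In the deliberately simple case of a constant collateral balance, flat rates, and no default risk, the LVA integral of Proposition 1 can be evaluated in closed form, which is convenient for sanity-checking an implementation. The closed form below is our own elementary computation under these stated assumptions.

```python
import math

def lva_flat(C, r, c_tilde, T):
    """LVA of Proposition 1 with constant collateral C, flat rates r and c~,
    and no default risk: integral over [0, T] of exp(-r*s) * (r - c~) * C ds
    = (r - c~) * C * (1 - exp(-r*T)) / r."""
    if r == 0.0:
        return (r - c_tilde) * C * T   # limit as r -> 0
    return (r - c_tilde) * C * (1.0 - math.exp(-r * T)) / r
```

When $\tilde{c} = r$ the LVA vanishes, as noted in the proposition; a positive collateral balance remunerated below the risk-free rate produces a positive adjustment in the decomposition (18).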

We now take a number of heuristic steps. A more formal analysis in terms of FBSDEs or PDEs is provided, for example, in Brigo et al. [15]. For simplicity, we first switch to the default-free market filtration $(\mathcal{F}_t)_{t\ge0}$. This step implicitly assumes a separable structure of our complete filtration $(\mathcal{G}_t)_{t\ge0}$. We also assume that the basic portfolio cash flows $\pi(0,t)$ are $\mathcal{F}_t$-measurable and that the default times of all parties are conditionally independent given the filtration $\mathcal{F}$.

Assuming the relevant technical conditions are satisfied, the Feynman–Kac theorem now allows us to write down the corresponding pre-default partial differential equation (PDE) of the valuation problem (further details may be found in Brigo et al. [13, 14], and Sloth [37]). This PDE could be solved directly as in Crépey [22]. However, if we apply the Feynman–Kac theorem again—this time going from the pre-default PDE to the valuation expectation—and integrate by parts, we arrive at the following result

**Theorem 3** (Continuous-time Solution of the Generalized Valuation Equation) *If we assume collateral rehypothecation and delta-hedging, we can solve the iterative equations of Theorem 2 in continuous time. We obtain*

$$\bar{V}_t = \int_t^T \mathbb{E}^{\tilde{f}} \left\{ D(t, u; \tilde{f} + \lambda) \left[\pi_u + \lambda_u \theta_u + (\tilde{f}_u - \tilde{c}_u) C_u\right] \,\Big|\, \mathcal{F}_t \right\} du \tag{20}$$

*where $\lambda_t$ is the first-to-default intensity, $\pi_t\,dt$ is shorthand for $\pi(t, t+dt)$, and the discount factor is defined as $D(t,s;\xi) := e^{-\int_t^s \xi_u\,du}$. The expectations are taken under the pricing measure $\mathbb{Q}^{\tilde{f}}$ under which the underlying risk factors grow at the rate $\tilde{f}$ when the underlying pays no dividend.*

Theorem 3 decomposes the deal price $\bar{V}$ into three intuitive terms. The first term is the value of the deal cash flows, discounted at the funding rate plus credit. The second term is the price of the on-default cash flow in excess of the collateral, which includes the CVA and DVA of the deal after collateralization. The last term collects the cost of collateralization. At this point it is very important to appreciate once again that $\tilde{f}$ depends on $F$, and hence on $\bar{V}$.

*Remark 2* (*Deal-dependent Valuation Measure, Local Risk-neutral Measures*). Since the pricing measure depends on $\tilde{f}$, which in turn depends on the very value $\bar{V}$ we are trying to compute, the valuation measure becomes deal/portfolio-dependent. Claims sharing a common set of hedging instruments can be priced under a common measure.

Finally, we stress once again a very important invariance result that first appeared in Pallavicini et al. [34] and was studied in detail in a more mathematical setting in Brigo et al. [15]. The proof is immediate by inspection.

**Theorem 4** (Invariance of the Valuation Equation w.r.t. the Short Rate $r_t$). *Equation* (20) *for valuation under credit, collateral, and funding costs is completely governed by market rates; there is no dependence on a risk-free rate $r_t$. Whichever initial process is postulated for $r$, the final price is invariant to it.*

# **4 Nonlinear Valuation: A Numerical Analysis**

This section provides a numerical case study of the valuation framework outlined in the previous sections. We investigate the impact of funding risk on the price of a derivatives trade under default risk and collateralization. We also analyze the valuation error incurred by ignoring the nonlinearities of the general valuation problem. Specifically, to quantify this error, we introduce the concept of a nonlinearity valuation adjustment (NVA). We propose a generalized least-squares Monte Carlo algorithm inspired by the simulation methods of Carriere [18], Longstaff and Schwartz [30], Tilley [38], and Tsitsiklis and Van Roy [39] for pricing American-style options. As the purpose is to understand the fundamental implications of funding risk and other nonlinearities, we focus on trading positions in relatively simple derivatives. However, the Monte Carlo method we propose below can be applied to more complex derivative contracts, including derivatives with bilateral payments.

# *4.1 Monte Carlo Pricing*

Recall the recursive structure of the general valuation: The deal price depends on the funding decisions, while the funding strategy depends on the future price itself. The intimate relationship among the key quantities makes the valuation problem computationally challenging.

We consider $K$ default scenarios during the life of the deal, either obtained by simulation, bootstrapped from empirical data, or assumed in advance. For each first-to-default time $\tau$ corresponding to a default scenario, we compute the price of the deal $\bar{V}$ under collateralization, close-out netting, and funding costs. The first step of our simulation method entails simulating a large number $N$ of sample paths of the underlying risk factors $X$. We simulate these paths on the time grid $\{t_1,\dots,t_m = T^*\}$ with step size $\Delta t = t_{j+1} - t_j$ from the assumed dynamics of the risk factors. $T^*$ is equal to the final maturity $T$ of the deal or the consecutive time-grid point following the first-to-default time $\tau$, whichever occurs first. For simplicity, we assume the time periods for funding decisions and collateral margin payments coincide with the simulation time grid.
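
The path-generation step can be sketched as follows. This is a minimal illustration assuming, as in the case study below, that the single risk factor follows a geometric Brownian motion under the risk-neutral measure; the function and parameter names are ours.

```python
import numpy as np

def simulate_paths(s0, r, sigma, T, m, n, seed=0):
    """Simulate n paths of a geometric Brownian motion on an equally
    spaced grid {t_1, ..., t_m = T}, using the exact log-normal scheme."""
    rng = np.random.default_rng(seed)
    dt = T / m
    z = rng.standard_normal((n, m))
    # log-increments of the exact GBM solution over each step
    increments = (r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
    log_paths = np.log(s0) + np.cumsum(increments, axis=1)
    return np.exp(log_paths)  # shape (n, m)

paths = simulate_paths(s0=100.0, r=0.01, sigma=0.25, T=3.0, m=36, n=100_000)
# Sanity check: under the risk-neutral measure, E[S_T] = S_0 * exp(r*T)
print(paths[:, -1].mean())  # close to 100 * exp(0.03) ≈ 103.05
```

The exact scheme avoids the discretization bias of an Euler step, which matters when the grid is coarse.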

Given the set of simulated paths, we solve the funding strategy recursively in a dynamic-programming fashion. Starting one period before $T^*$, we compute for each simulated path the funding decision $F$ and the deal price $\bar{V}$ according to the set of backward-inductive equations of Theorem 2. Note that while the reduced formulation of Theorem 3 may look simpler at first sight, avoiding the implicit recursive structure of Theorem 2, it would instead give us a forward–backward SDE problem to solve, since the underlying asset now accrues at the funding rate, which itself depends on $\bar{V}$. The algorithm then proceeds recursively until time zero. Ultimately, the total price of the deal is computed as the probability-weighted average of the individual prices obtained in each of the $K$ default scenarios.

The conditional expectations in the backward-inductive funding equations are approximated by across-path regressions based on least-squares estimation, similar to Longstaff and Schwartz [30]. We regress the present value of the deal price at time $t_{j+1}$, the adjusted payout cash flow between $t_j$ and $t_{j+1}$, and the collateral and funding accounts at time $t_j$ on basis functions $\psi$ of realizations of the underlying risk factors at time $t_j$ across the simulated paths. To keep notation simple, let us assume that we are exposed to only one underlying risk factor, e.g. a stock price. Specifically, the conditional expectations in the iterative equations of Theorem 2, taken under the risk-neutral measure, are equal to

$$\mathbb{E}_{t_j} \left[ \Xi_{t_j}(\bar{V}_{t_{j+1}}) \right] = \theta_{t_j}'\; \psi(X_{t_j}), \tag{21}$$

where we have defined $\Xi_{t_j}(\bar{V}_{t_{j+1}}) := D(t_j, t_{j+1})\,\bar{V}_{t_{j+1}} + \bar{\pi}(t_j, t_{j+1}; C) - C_{t_j} - H_{t_j}$. Note that the $C_{t_j}$ term drops out if rehypothecation is not allowed. The usual least-squares estimator of $\theta$ is then given by

$$\hat{\theta}_{t_j} \stackrel{\triangle}{=} \left[ \psi(X_{t_j})\, \psi(X_{t_j})' \right]^{-1} \psi(X_{t_j})\; \Xi_{t_j}(\bar{V}_{t_{j+1}}) . \tag{22}$$

Orthogonal polynomials such as Chebyshev, Hermite, Laguerre, and Legendre may all be used as basis functions for evaluating the conditional expectations. We find, however, that simple power series are quite effective and that the order of the polynomials can be kept relatively small. In fact, linear or quadratic polynomials, i.e. $\psi(X_{t_j}) = (1, X_{t_j}, X_{t_j}^2)'$, are often enough.
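
As an illustration of the regression step (21)–(22), the following sketch estimates a conditional expectation across simulated paths with the quadratic basis. The function name and the synthetic test data are ours; in the actual algorithm the response would be the quantity $\Xi_{t_j}(\bar{V}_{t_{j+1}})$ defined above.

```python
import numpy as np

def conditional_expectation(x, xi):
    """Least-squares estimate of E[xi | x] across paths,
    using the quadratic basis psi(x) = (1, x, x^2)."""
    psi = np.column_stack([np.ones_like(x), x, x**2])
    # theta_hat = (psi' psi)^{-1} psi' xi, via a numerically stable solver
    theta, *_ = np.linalg.lstsq(psi, xi, rcond=None)
    return psi @ theta  # fitted conditional expectation, path by path

# Toy check on synthetic data: if xi = x^2 + noise, the regression
# should recover E[xi | x] = x^2 up to Monte Carlo error.
rng = np.random.default_rng(1)
x = rng.standard_normal(50_000)
xi = x**2 + 0.1 * rng.standard_normal(50_000)
fit = conditional_expectation(x, xi)
print(np.max(np.abs(fit - x**2)))  # small
```

Using `lstsq` rather than forming the normal equations explicitly improves conditioning when basis columns are nearly collinear.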

Further complexities are added when the dealer, realistically, decides to hedge the full deal price $\bar{V}$. Now the hedge $H$ itself depends on the funding strategy through $\bar{V}$, while the funding decision depends on the hedging strategy. This added recursion requires that we solve the funding and hedging strategies simultaneously. For example, if the dealer applies a delta-hedging strategy, we can write, heuristically,

$$H_{t_j} = \frac{\partial \bar{V}}{\partial X}\Big|_{t_j} X_{t_j} \approx \frac{\bar{V}_{t_{j+1}} - (1 + \Delta t_j \tilde{f}_{t_j})\,\bar{V}_{t_j}}{X_{t_{j+1}} - (1 + \Delta t_j \tilde{f}_{t_j})\,X_{t_j}}\, X_{t_j}, \tag{23}$$

and we obtain, in the case of rehypothecation, the following system of nonlinear equations

$$\begin{cases} F_{t_j} - \dfrac{P^f_{t_j}(t_{j+1})}{P_{t_j}(t_{j+1})}\, \mathbb{E}_{t_j} \left[ \Xi_{t_j}(\bar{V}_{t_{j+1}}) \right] = 0, \\[3mm] H_{t_j} - \dfrac{\bar{V}_{t_{j+1}} - (1 + \Delta t_j \tilde{f}_{t_j})\,\bar{V}_{t_j}}{X_{t_{j+1}} - (1 + \Delta t_j \tilde{f}_{t_j})\,X_{t_j}}\, X_{t_j} = 0, \\[3mm] \bar{V}_{t_j} = F_{t_j} + C_{t_j} + H_{t_j}, \end{cases} \tag{24}$$

where all matrix operations are on an element-by-element basis. An analogous result holds when rehypothecation of the posted collateral is forbidden.

In each period and for each simulated path, we find the funding and hedging decisions by solving this system of equations, given the funding and hedging strategies for all future periods until the end of the deal. We apply a simple Newton–Raphson method to solve the system of nonlinear equations numerically; instead of using the exact Jacobian, we approximate it by finite differences. As initial guess, we use the Black–Scholes delta position

$$H^0_{t_j} = \Delta^{BS}_{t_j}\, X_{t_j}.$$

The convergence is quite fast and only a small number of iterations are needed in practice. Finally, if the dealer decides to hedge only the risk-free price of the deal, i.e. the classic derivative price *V*, the valuation problem collapses to a much simpler one. The hedge *H* no longer depends on the funding decision and can be computed separately, and the numerical solution of the nonlinear equation system can be avoided altogether.
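
The per-period root-finding step can be illustrated with a generic Newton–Raphson solver using a finite-difference Jacobian. The toy two-equation system below merely mimics the feedback structure of (24), where one unknown feeds back into the other's equation; it is not the actual funding-hedging system, and all names are ours.

```python
import numpy as np

def newton_fd(f, x0, tol=1e-10, max_iter=50, h=1e-7):
    """Newton-Raphson for f(x) = 0 with a forward-difference Jacobian."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        fx = f(x)
        if np.max(np.abs(fx)) < tol:
            break
        n = x.size
        jac = np.empty((n, n))
        for j in range(n):  # build the Jacobian column by column
            e = np.zeros(n)
            e[j] = h
            jac[:, j] = (f(x + e) - fx) / h
        x = x - np.linalg.solve(jac, fx)
    return x

# Toy system standing in for (24): the second unknown feeds back
# into the first equation, as H does into F through V-bar.
def g(v):
    F, H = v
    return np.array([F - 0.5 * (F + H) - 1.0, H - 0.25 * F])

root = newton_fd(g, x0=np.array([0.0, 0.0]))
print(root)  # the exact root is (8/3, 2/3)
```

In the actual algorithm this solver is called once per period and per path, warm-started from the Black–Scholes delta guess above, which is what keeps the iteration count small.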

In the following we apply our valuation framework to the case of a stock or equity index option. Nevertheless, the methodology extends fully to any other derivatives transaction. For instance, applications to interest rate swaps can be found in Pallavicini and Brigo [32] and Brigo and Pallavicini [6].

# *4.2 Case Outline*

Let $S_t$ denote the price of some stock or equity index and assume it evolves according to a geometric Brownian motion $dS_t = r S_t\,dt + \sigma S_t\,dW_t$, where $W$ is a standard Brownian motion under the risk-neutral measure. The risk-free interest rate $r$ is 100 bps, the volatility $\sigma$ is 25%, and the current price of the underlying is $S_0 = 100$. The European call option is in-the-money and has strike $K = 80$. The maturity $T$ of the deal is 3 years and, in the full case, we assume that the investor delta-hedges the deal according to (23). The usual default-free, funding-free, and collateral-free Black–Scholes price of the call option deal is given by

$$V_t = S_t \Phi(d_1(t)) - K e^{-r(T-t)} \Phi(d_2(t)), \qquad d_{1,2}(t) = \frac{\ln(S_t/K) + (r \pm \sigma^2/2)(T-t)}{\sigma \sqrt{T-t}},$$

and for *t* = 0 we get

$$V\_0 = 28.9$$

with our choice of inputs. As usual, Φ is the cumulative distribution function of the standard normal random variable. In the usual setting, the hedge would not be (23) but a classical delta-hedging strategy based on Φ(*d*1(*t*)).

We consider two simple discrete probability distributions of default. Both parties of the deal are considered default risky but can only default at year 1 or at year 2. The joint default probabilities are provided in the matrices below. The rows denote the default time of the investor, while the columns denote the default time of the counterparty. For example, in matrix $D_{\text{low}}$ the event $(\tau_I = 2\text{yr}, \tau_C = 1\text{yr})$ has a 3% probability, and the first-to-default time is 1 year. Simultaneous defaults are introduced as an extension of our previous assumptions, and we determine the close-out amount by a random draw from a uniform distribution: if the random number is above 0.5, we compute the close-out as if the counterparty defaulted first, and vice versa.

For the first default distribution, we have a low dependence between the default risk of the counterparty and the default risk of the investor

$$D_{\text{low}} = \begin{array}{c|ccc} & 1\text{yr} & 2\text{yr} & n.d. \\ \hline 1\text{yr} & 0.01 & 0.01 & 0.03 \\ 2\text{yr} & 0.03 & 0.01 & 0.05 \\ n.d. & 0.07 & 0.09 & 0.70 \end{array}\;, \qquad \tau_K(D_{\text{low}}) = 0.21 \tag{25}$$

where *n*.*d*. means no default and τ*<sup>K</sup>* denotes the rank correlation as measured by Kendall's tau. In the second case, we have a high dependence between the two parties' default risk

$$D_{\text{high}} = \begin{array}{c|ccc} & 1\text{yr} & 2\text{yr} & n.d. \\ \hline 1\text{yr} & 0.09 & 0.01 & 0.01 \\ 2\text{yr} & 0.03 & 0.11 & 0.01 \\ n.d. & 0.01 & 0.03 & 0.70 \end{array}\;, \qquad \tau_K(D_{\text{high}}) = 0.83 \tag{26}$$

Note also that the distributions are skewed in the sense that the counterparty has a higher default probability than the investor. The loss given default is 50% for both the investor and the counterparty, and the loss on any posted collateral is considered the same. The collateral rates are chosen equal to the risk-free rate. We assume that the collateral account equals the risk-free price of the deal at each margin date, i.e. $C_t = V_t$. This is reasonable as the dealer and client will be able to agree on this price, in contrast to $\bar{V}_t$, due to asymmetric information. Also, choosing the collateral this way has the added advantage that the collateral account $C$ works as a control variate, reducing the variance of the least-squares Monte Carlo estimator of the deal price.
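
The two joint distributions (25) and (26) can be encoded and checked directly. The sketch below verifies that each matrix is a proper probability distribution and reproduces the stated skew between the parties' default probabilities; the variable names are ours.

```python
import numpy as np

# Joint default-time distributions (25) and (26); rows: investor tau_I,
# columns: counterparty tau_C, over the categories (1yr, 2yr, no default).
D_low = np.array([[0.01, 0.01, 0.03],
                  [0.03, 0.01, 0.05],
                  [0.07, 0.09, 0.70]])
D_high = np.array([[0.09, 0.01, 0.01],
                   [0.03, 0.11, 0.01],
                   [0.01, 0.03, 0.70]])

for D in (D_low, D_high):
    assert abs(D.sum() - 1.0) < 1e-12           # probabilities sum to one
    p_inv = 1.0 - D[2, :].sum()                 # P(investor defaults)
    p_cpty = 1.0 - D[:, 2].sum()                # P(counterparty defaults)
    print(p_inv, p_cpty)                        # counterparty is riskier
```

For $D_{\text{low}}$ the marginal default probabilities are 0.14 (investor) and 0.22 (counterparty); for $D_{\text{high}}$ they are 0.26 and 0.28, confirming the skew.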

# *4.3 Preliminary Valuation Under Symmetric Funding and Without Credit Risk*

To provide some ball-park figures on the effect of funding risk, we first look at the case without default risk and without collateralization of the deal. We compare our Monte Carlo approach to the following two alternative (simplified) approaches:

(a) The Black–Scholes price where both discounting and the growth of the underlying happens at the symmetric funding rate

$$V\_t^{(a)} = \left( S\_t \Phi(g\_1(t)) - K e^{-\hat{f}(T-t)} \Phi(g\_2(t)) \right),$$

$$g\_{1,2} = \frac{\ln(S\_t/K) + (\hat{f} \pm \sigma^2/2)(T-t)}{\sigma \sqrt{T-t}}.$$

(b) We use the above FVA formula in Proposition 1 with some approximations. Since in a standard Black–Scholes setting $F_t = -K e^{-r(T-t)}\, \Phi(d_2(t))$, we compute

$$\begin{aligned} \text{FVA}^{(b)} &= (r - \hat{f}) \int_0^T \mathbb{E}_0 \left[ e^{-rs} F_s \right] ds \\ &= (\hat{f} - r)\, K e^{-rT} \int_0^T \mathbb{E}_0 \left[ \Phi\left( d_2(s) \right) \right] ds \end{aligned}$$

We illustrate the two approaches for a long position in an equity call option. Moreover, let the funding valuation adjustment in each case be defined by $\text{FVA}^{(a,b)} = V^{(a,b)} - V$. Figure 1 plots the resulting funding valuation adjustment with credit and collateral switched off under both simplified approaches and under the full valuation approach. Recall that if the funding rate is equal to the risk-free rate, the value of the call option collapses to the Black–Scholes price and the funding valuation adjustment is zero.
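
For concreteness, the time integral in method (b) can be estimated by plain Monte Carlo. The sketch below assumes the case-study inputs and an illustrative symmetric funding rate of $\hat{f} = 150$ bps (a 50 bps spread); the function name, grid, and path counts are ours.

```python
import numpy as np
from math import erf, sqrt, exp

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def fva_b(s0, k, r, f_hat, sigma, T, n_steps=30, n_paths=20_000, seed=2):
    """Monte Carlo estimate of the approximation
    FVA^(b) = (f_hat - r) K e^{-rT} int_0^T E[Phi(d2(s))] ds,
    with the time integral evaluated by the midpoint rule."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    integral = 0.0
    for s in (np.arange(n_steps) + 0.5) * dt:
        # exact GBM sample of S_s under the risk-neutral measure
        z = rng.standard_normal(n_paths)
        S = s0 * np.exp((r - 0.5 * sigma**2) * s + sigma * np.sqrt(s) * z)
        d2 = (np.log(S / k) + (r - 0.5 * sigma**2) * (T - s)) / (sigma * np.sqrt(T - s))
        integral += np.mean([phi(x) for x in d2]) * dt
    return (f_hat - r) * k * exp(-r * T) * integral

print(fva_b(100.0, 80.0, r=0.01, f_hat=0.015, sigma=0.25, T=3.0))  # roughly 0.75
```

Since $\Phi(d_2(s))$ is the time-$s$ conditional exercise probability, its time-zero expectation is constant in $s$ by the tower property, which makes this particular integral easy to check analytically.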

#### *Remark 3* (*Current Market Practice for FVA*).

Looking at Fig. 1, it is important to realize that at the time of writing this paper, most market players would adopt a methodology like (a) or (b) for a simple call option. Even if borrowing or lending rates were different, most market players would average them and apply a common rate to borrowing and lending, in order to avoid nonlinearities. We notice that method (b) produces the same results as the quicker method (a) which simply replaces the risk-free rate by the funding rate. In the simple case without credit and collateral, and with symmetric borrowing and lending rates, we can show that this method is sound since it stems directly from (20). We also see that both methods (a) and (b) are quite close to the full numerical method we adopt. Overall both simplified methods (a) and (b) work well here, and there would be no need to implement the full machinery under these simplifying assumptions. However, once collateral, credit, and funding risks are in the picture, we have to abandon approximations like (a) or (b) and implement the full methodology instead.

**Fig. 1** Funding valuation adjustment of a long call position as a function of the symmetric funding spread $s_f := \hat{f} - r$ with $\hat{f} := f^+ = f^-$. The adjustments are computed under the assumption of no default risk and no collateralization

# *4.4 Complete Valuation Under Credit Risk, Collateral, and Asymmetric Funding*

Let us now switch on credit risk and consider the impact of asymmetric funding rates. Due to the presence of collateral as a control variate, the accuracy is quite good in our example even for relatively small numbers of sample paths. Based on the simulation of 1,000 paths, Tables 1 and 2 report the results of a ceteris paribus analysis of funding risk under counterparty credit risk and collateralization. Specifically, we investigate how the value of a deal changes for different values of the borrowing (lending) rate $f^+$ ($f^-$) while keeping the lending (borrowing) rate fixed at 100 bps. When both funding rates are equal to 100 bps, the deal is funded at the risk-free rate and we are in the classical derivatives valuation setting.

#### *Remark 4* (*Potential Arbitrage*).

Note that if $f^+ < f^-$, arbitrage opportunities might be present unless certain constraints are imposed on the funding policy of the treasury. Such constraints may look unrealistic and may themselves be debated from the point of view of arbitrageability, but since our point here is strictly to explore the impact of asymmetries in the funding equations, we will still apply our framework to a few examples where $f^+ < f^-$.

Table 1 reports the impact of changing funding rates for a call position when the posted collateral may not be used for funding the deal, i.e. rehypothecation is not allowed. First, we note that increasing the lending rate for a long position has a much


**Table 1** Price impact of funding with default risk and collateralization

Standard errors of the price estimates are given in parentheses

<sup>a</sup>Ceteris paribus changes in one funding rate while keeping the other fixed to 100 bps

<sup>b</sup>Based on the joint default distribution $D_{\text{low}}$ with low dependence

<sup>c</sup>Based on the joint default distribution $D_{\text{high}}$ with high dependence



**Table 2** Price impact of funding with default risk, collateralization, and rehypothecation

Standard errors of the price estimates are given in parentheses

<sup>a</sup>Ceteris paribus changes in one funding rate while keeping the other fixed to 100 bps

<sup>b</sup>Based on the joint default distribution $D_{\text{low}}$ with low dependence

<sup>c</sup>Based on the joint default distribution $D_{\text{high}}$ with high dependence

larger impact than increasing the borrowing rate. This is due to the fact that a call option is just a one-sided contract. Recall that $F$ is defined as the cash account needed as part of the derivative replication strategy or, analogously, the cash account required to fund the hedged derivative position. To hedge a long call, the investor goes short a delta position in the underlying asset and invests the excess cash with the treasury at $f^-$. Correspondingly, to hedge the short position, the investor enters a long delta position in the stock and finances it by borrowing cash from the treasury at $f^+$, so changing the lending rate only has a small effect on the deal value. Finally, due to the presence of collateral, we observe a very similar price impact of funding under the two default distributions $D_{\text{low}}$ and $D_{\text{high}}$.

Finally, assuming cash collateral, we consider the case of rehypothecation and allow the investor and counterparty to use any posted collateral as a funding source. If the collateral is posted to the investor, this means it effectively reduces his costs of funding the delta-hedging strategy. As the payoff of the call is one-sided, the investor only receives collateral when he holds a long position in the call option. But as he hedges this position by short-selling the underlying stock and lending the excess cash proceeds, the collateral adds to his cash lending position and increases the funding benefit of the deal. Analogously, if the investor has a short position, he posts collateral to the counterparty and a higher borrowing rate would increase his costs of funding the collateral he has to post as well as his delta-hedge position. Table 2 reports the results for the short and long positions in the call option when rehypothecation is allowed. Figures 2 and 3 plot the values of collateralized long and short positions in the call option as a function of asymmetric funding spreads. In addition, Fig. 4

**Fig. 2** The value of a long call position for asymmetric funding spreads $s_f^- = f^- - r$, i.e. fixing $f^+ = r = 0.01$ and varying $f^- \in (0.01, 0.0125, 0.015, 0.0175, 0.02)$

**Fig. 3** The value of a short call position for asymmetric funding spreads $s_f^+ = f^+ - r$, i.e. fixing $f^- = r = 0.01$ and varying $f^+ \in (0.01, 0.0125, 0.015, 0.0175, 0.02)$

**Fig. 4** Funding valuation adjustment as a function of asymmetric funding spreads. The adjustments are computed in the presence of default risk and collateralization

reports the FVA with respect to the magnitude of the funding spreads, where the FVA is defined as the difference between the full funding-inclusive deal price and the full deal price computed with symmetric funding rates equal to the risk-free rate. Recall that the collateral rates are equal to the risk-free rate, so the LVA collapses to zero in these examples.

This shows that funding asymmetry matters even under full collateralization when there is no repo market for the underlying stock. In practice, however, the dealer cannot hedge a long call by shorting a stock he does not own. Instead, he would first borrow the stock in a repo transaction and then sell it in the spot market. Similarly, to enter the long delta position needed to hedge a short call, the dealer could finance the purchase by lending the stock in a reverse repo transaction. Effectively, the delta position in the underlying stock would be funded at the prevailing repo rate. Thus, once the delta hedge has to be executed through the repo market, there is no funding valuation adjustment (meaning any dependence on the funding rate $\tilde{f}$ drops out) given that the deal is fully collateralized, but the underlying asset still grows at the repo rate. If there is no credit risk, this would leave us with the result of Piterbarg [36]. However, if the deal is not fully collateralized or the collateral cannot be rehypothecated, funding costs enter the picture even when there is a repo market for the underlying stock.

# *4.5 Nonlinearity Valuation Adjustment*

In this last section we introduce a nonlinearity valuation adjustment and, to stay within the usual jargon of the business, we abbreviate it NVA. The NVA is defined as the difference between the true price $\bar{V}$ and a version of $\bar{V}$ where nonlinearities have been approximated away through a blunt symmetrization of rates and possibly a change in the close-out convention from replacement close-out to risk-free close-out. This entails a degree of double counting (both positive and negative interest). In some situations the positive and negative double counting will offset each other, but in other cases this may not happen. Moreover, as pointed out by Brigo et al. [10], a further source of double counting might be neglecting the first-to-default time in bilateral CVA/DVA valuation. This is done in a number of industry approximations.

Let $\hat{V}$ be the resulting price when we replace both $f^+$ and $f^-$ by $\hat{f} := (f^+ + f^-)/2$ and adopt a risk-free close-out at default in our valuation framework. A further simplification in $\hat{V}$ could be to neglect the first-to-default check in the close-out. We have the following definition.

**Definition 1** (*Nonlinearity Valuation Adjustment, NVA*) NVA is defined as

$$\text{NVA}_t \triangleq \bar{V}_t - \hat{V}_t$$

where $\bar{V}$ denotes the full nonlinear deal value while $\hat{V}$ denotes an approximate linearized price of the deal.

**Fig. 5** Nonlinearity valuation adjustment (as a percentage of $\hat{V}$) for different funding spreads $s_f^+ = f^+ - f^- \in (0, 0.005, 0.01, 0.015, 0.02)$ and fixed $\hat{f} = (f^+ + f^-)/2 = 0.01$

As an illustration, we revisit the above example of an equity call option and analyze the NVA in a number of cases. The results are reported in Figs. 5 and 6.

In both figures, we compare the NVA under risk-free close-out and under replacement close-out. We can see that, depending on the direction of the symmetrization, the NVA may be either positive or negative. As the funding spread increases, the NVA grows in absolute value. In addition, adopting the replacement close-out amplifies the presence of double counting. The NVA accounts for up to 15% of the full deal price $\bar{V}$ depending on the funding spread, a relevant figure in a valuation context.

Table 3 reports the NVA as (a) a fraction of the approximated deal price $\hat{V}$, and (b) a fraction of the full deal price $\bar{V}$ (with symmetric funding rates equal to the risk-free rate $r$). Notice that for those cases where we adopt a risk-free close-out at default, the results primarily highlight the double-counting error due to the symmetrization of borrowing and lending rates. We should point out that close-out nonlinearities play a limited role here, due to the absence of wrong-way risk. An analysis of close-out nonlinearity under wrong-way risk is under development.

Finally, it should be noted that linearization may in fact be done in arbitrarily many ways, for example by playing with the discount factor, hence taking the average of the two funding rates as in our definition of NVA is not necessarily the best choice. However, we postpone further investigations into this interesting topic to future research.

**Fig. 6** Nonlinearity valuation adjustment (as a percentage of $\hat{V}$) for different funding spreads $s_f^- = f^- - f^+ \in (0, 0.005, 0.01, 0.015, 0.02)$ and fixed $\hat{f} = (f^+ + f^-)/2 = 0.01$


**Table 3** %NVA with default risk, collateralization and rehypothecation

<sup>a</sup>Funding spread $s_f = f^- - f^+$

<sup>b</sup>The prices of the call option are based on the joint default distribution $D_{\text{high}}$ with high dependence

# **5 Conclusions and Financial Implications**

We have developed a consistent framework for valuation of derivative trades under collateralization, counterparty credit risk, and funding costs. Based on no arbitrage, we derived a generalized pricing equation where CVA, DVA, LVA, and FVA are introduced by simply modifying the payout cash flows of the trade. The framework is flexible enough to accommodate actual trading complexities such as asymmetric collateral and funding rates, replacement close-out, and rehypothecation of posted collateral. Moreover, we presented an invariance theorem showing that the valuation framework does not depend on any theoretical risk-free rate, but is purely based on observable market rates.

The generalized valuation equation under credit, collateral, and funding takes the form of a forward–backward SDE or semi-linear PDE. Nevertheless, it can be recast as a set of iterative equations which can be efficiently solved by the proposed least-squares Monte Carlo algorithm. Our numerical results confirm that funding risk, as well as asymmetries in borrowing and lending rates, has a significant impact on the ultimate value of a derivatives transaction.

Introducing funding costs into the pricing equation makes the valuation problem recursive and nonlinear. The price of the deal depends on the trader's funding strategy, while to determine the funding strategy we need to know the deal price itself. Credit and funding risks are in general non-separable; this means that FVA is not an additive adjustment, let alone a discounting spread. Thus, despite being common practice among market participants, treating it as such comes at the cost of double counting. We introduce the "nonlinearity valuation adjustment" (NVA) to quantify the effect of double counting and we show that its magnitude can be significant under asymmetric funding rates and replacement close-out at default.

Furthermore, valuation under funding costs is no longer bilateral, as the particular funding policy chosen by the dealer is not known to the client, and vice versa. As a result, the value of the trade will generally differ between the two counterparties.

Finally, valuation depends on the level of aggregation: asset portfolios cannot simply be priced separately and added up. Theoretically, valuation is conducted under deal- or portfolio-dependent risk-neutral measures. This has clear operational consequences for financial institutions: it is difficult for banks to establish CVA and FVA desks with separate, clear-cut responsibilities. In theory, they should adopt a consistent valuation approach across all trading desks and asset classes. A trade should be priced at an appropriate aggregation level to quantify the value it actually adds to the business. This, of course, prompts the old distinction between price and value: should funding costs be charged to the client or just included internally to determine the profitability of a particular trade? The relevance of this question is reinforced by the fact that the client has no direct control over the funding policy of the bank and therefore cannot influence any potential inefficiencies for which he or she would have to pay.

While holistic trading applications may be unrealistic with current technology, our valuation framework offers a unique understanding of the nature and presence of nonlinearities and paves the way for developing more suitable and practical linearizations. The latter topic we will leave for future research.

**Acknowledgements** The KPMG Center of Excellence in Risk Management is acknowledged for organizing the conference "Challenges in Derivatives Markets - Fixed Income Modeling, Valuation Adjustments, Risk Management, and Regulation".

**Open Access** This chapter is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, a link is provided to the Creative Commons license and any changes made are indicated.

The images or other third party material in this chapter are included in the work's Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work's Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.

# **Analysis of Nonlinear Valuation Equations Under Credit and Funding Effects**

**Damiano Brigo, Marco Francischello and Andrea Pallavicini**

**Abstract** We study conditions for existence, uniqueness, and invariance of the comprehensive nonlinear valuation equations first introduced in Pallavicini et al. (Funding valuation adjustment: a consistent framework including CVA, DVA, collateral, netting rules and re-hypothecation, 2011, [11]). These equations take the form of semi-linear PDEs and Forward–Backward Stochastic Differential Equations (FBSDEs). After summarizing the cash flow definitions allowing us to extend valuation to credit risk and default closeout, including collateral margining with possible re-hypothecation, and treasury funding costs, we show how such cash flows, when present-valued in an arbitrage-free setting, lead to semi-linear PDEs or more generally to FBSDEs. We provide conditions for existence and uniqueness of such solutions in a classical sense, discussing the role of the hedging strategy. We show an invariance theorem stating that even though we start from a risk-neutral valuation approach based on a locally risk-free bank account growing at a risk-free rate, our final valuation equations do not depend on the risk-free rate. Indeed, our final semi-linear PDEs or FBSDEs and their classical solutions depend only on contractual, market or treasury rates, and we do not need to proxy the risk-free rate with a real market rate, since it acts as an instrumental variable. The equations' derivations, their numerical solutions, the related XVA valuation adjustments with their overlap, and the invariance result have been analyzed numerically and extended to central clearing and multiple discount curves in a number of previous works, including Brigo and Pallavicini (J. Financ. Eng. 1(1):1–60 (2014), [3]), Pallavicini and Brigo (Interest-rate modelling in collateralized markets: multiple curves, credit-liquidity effects, CCPs, 2011, [10]), Pallavicini et al.
(Funding valuation adjustment: a consistent framework including CVA, DVA, collateral, netting rules and re-hypothecation, 2011, [11]), Pallavicini et al. (Funding, collateral and hedging: uncovering the mechanics and the subtleties of

D. Brigo (B) · M. Francischello

Imperial College London, London SW7 2AZ, UK e-mail: damiano.brigo@imperial.ac.uk

M. Francischello e-mail: m.francischello14@imperial.ac.uk

A. Pallavicini Banca IMI, Largo Mattioli 3, Milan 20121, Italy e-mail: andrea.pallavicini@imperial.ac.uk

funding valuation adjustments, 2012, [12]), and Brigo et al. (Nonlinear valuation under collateral, credit risk and funding costs: a numerical case study extending Black–Scholes, [5]).

**Keywords** Counterparty credit risk · Funding valuation adjustment · Funding costs · Collateralization · Nonlinearity valuation adjustment · Nonlinear valuation · Derivatives valuation · Semi-linear PDE · FBSDE · BSDE · Existence and uniqueness of solutions

# **1 Introduction**

This is a technical paper where we analyze in detail invariance, existence, and uniqueness of solutions for nonlinear valuation equations inclusive of credit risk, collateral margining with possible re-hypothecation, and funding costs. In particular, we study conditions for existence, uniqueness, and invariance of the comprehensive nonlinear valuation equations first introduced in Pallavicini et al. (2011) [11]. After briefly summarizing the cash flow definitions allowing us to extend valuation to default closeout, collateral margining with possible re-hypothecation, and treasury funding costs, we show how such cash flows, when present-valued in an arbitrage-free setting, lead straightforwardly to semi-linear PDEs or more generally to FBSDEs. We study conditions for existence and uniqueness of such solutions.

We formalize an invariance theorem showing that even though we start from a risk-neutral valuation approach based on a locally risk-free bank account growing at a risk-free rate, our final valuation equations do not depend on the risk-free rate at all. In other words, we do not need to proxy the risk-free rate with any actual market rate, since it acts as an instrumental variable that does not manifest itself in our final valuation equations. Indeed, our final semi-linear PDEs or FBSDEs and their classical solutions depend only on contractual, market or treasury rates and contractual closeout specifications once we use a hedging strategy that is defined as a straightforward generalization of the natural delta hedging in the classical setting.

The equations' derivations, their numerical solutions, and the invariance result have been analyzed numerically and extended to central clearing and multiple discount curves in a number of previous works, including [3, 5, 10–12], and the monograph [6], which further summarizes earlier credit and debit valuation adjustment (CVA and DVA) results. We refer to such works and references therein for a general introduction to comprehensive nonlinear valuation and to the related valuation adjustments for credit (CVA), collateral (LVA), and funding costs (FVA). In this paper, given the technical nature of our investigation and the emphasis on nonlinear valuation, we refrain from decomposing the nonlinear value into valuation adjustments or XVAs. Moreover, in practice such a separation is possible only under very specific assumptions, while in general all terms depend on all risks due to nonlinearity. Forcing separation may lead to double counting, as initially analyzed through the Nonlinearity Valuation Adjustment (NVA) in [5]. Separation is discussed in the CCP setting in [3].

The paper is structured as follows.

Section 2 introduces the probabilistic setting, the cash flows analysis, and derives a first valuation equation based on conditional expectations. Section 3 derives an FBSDE under the default-free filtration from the initial valuation equation under assumptions of conditional independence of default times and of default-free initial portfolio cash flows. Section 4 specifies the FBSDE obtained earlier to a Markovian setting and studies conditions for existence and uniqueness of solutions for the nonlinear valuation FBSDE and classical solutions to the associated PDE. Finally, we present the invariance theorem: when adopting delta-hedging, the solution does not depend on the risk-free rate.

# **2 Cash Flows Analysis and First Valuation Equation**

We fix a filtered probability space $(\Omega, \mathcal{A}, \mathbb{Q})$, with a filtration $(\mathcal{G}\_u)\_{u \ge 0}$ representing the evolution of all the available information on the market. With an abuse of notation, we will refer to $(\mathcal{G}\_u)\_{u \ge 0}$ by $\mathcal{G}$. The object of our investigation is a portfolio of contracts, or "contract" for brevity, typically a netting set, with final maturity $T$, between two financial entities, the investor $I$ and the counterparty $C$. Both $I$ and $C$ are supposed to be subject to default risk. In particular, we model their default times with two $\mathcal{G}$-stopping times $\tau\_I, \tau\_C$. We assume that the stopping times are generated by Cox processes with positive, stochastic intensities $\lambda^I$ and $\lambda^C$. Furthermore, we describe the *default-free* information by means of a filtration $(\mathcal{F}\_u)\_{u \ge 0}$ generated by the price of the underlying $S\_t$ of our contract. This process has the following dynamics under the measure $\mathbb{Q}$:

$$dS\_t = r\_t S\_t dt + \sigma(t, S\_t) dW\_t$$

where $r\_t$ is an $\mathcal{F}$-adapted process, called the *risk-free* rate. We then suppose the existence of a risk-free account $B\_t$ following the dynamics

$$dB\_t = r\_t B\_t dt.$$

We denote by $D(s, t, x) = e^{-\int\_s^t x\_u du}$ the discount factor associated with the rate $x\_u$. In the case of the risk-free rate, we define $D(s, t) := D(s, t, r)$.
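As a small numerical illustration (not from the paper), the discount factor $D(s, t, x) = e^{-\int\_s^t x\_u du}$ can be approximated from a sampled rate path with a trapezoidal rule; the grid and rate values below are arbitrary assumptions made only for this sketch.

```python
import numpy as np

def discount(s, t, rates, grid):
    """Approximate D(s, t, x) = exp(-int_s^t x_u du) by the trapezoidal rule."""
    mask = (grid >= s) & (grid <= t)
    g, x = grid[mask], rates[mask]
    integral = np.sum(0.5 * (x[1:] + x[:-1]) * np.diff(g))
    return float(np.exp(-integral))

grid = np.linspace(0.0, 1.0, 1001)
flat = np.full_like(grid, 0.02)              # constant rate x_u = 2%
print(discount(0.0, 1.0, flat, grid))        # exp(-0.02) ≈ 0.980199
```

For a stochastic rate one would feed in a simulated path of $x\_u$ on the same grid; the function is deliberately agnostic about where the path comes from.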

We further assume that for all $t$ we have $\mathcal{G}\_t = \mathcal{F}\_t \vee \mathcal{H}\_t^I \vee \mathcal{H}\_t^C$, where

$$\begin{aligned} \mathcal{H}\_t^{I} &= \sigma(\mathbf{1}\_{\{\tau\_I \le s\}},\ s \le t), \\ \mathcal{H}\_t^{C} &= \sigma(\mathbf{1}\_{\{\tau\_C \le s\}},\ s \le t). \end{aligned}$$

Again we indicate $(\mathcal{F}\_u)\_{u \ge 0}$ by $\mathcal{F}$ and we will write $\mathbb{E}\_t^{\mathcal{G}}[\cdot] := \mathbb{E}[\cdot \mid \mathcal{G}\_t]$ and similarly for $\mathcal{F}$. As in the classic framework of Duffie and Huang [8], we postulate the default times to be *conditionally independent* with respect to $\mathcal{F}$, i.e. for any $t > 0$ and $t\_1, t\_2 \in [0, t]$, we assume $\mathbb{Q}\{\tau\_I > t\_1, \tau\_C > t\_2 \mid \mathcal{F}\_t\} = \mathbb{Q}\{\tau\_I > t\_1 \mid \mathcal{F}\_t\}\,\mathbb{Q}\{\tau\_C > t\_2 \mid \mathcal{F}\_t\}$. Moreover, we indicate $\tau = \tau\_I \wedge \tau\_C$ and with these assumptions we have that $\tau$ has intensity $\lambda\_u = \lambda\_u^I + \lambda\_u^C$. For convenience of notation we use the symbol $\bar{\tau}$ to indicate the minimum between $\tau$ and $T$.
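For intuition on this setup, the Cox construction can be simulated directly; the sketch below is our own illustration (constant intensities are assumed only for simplicity) and checks that the first default time $\tau = \tau\_I \wedge \tau\_C$ indeed survives with the probability implied by the total intensity $\lambda^I + \lambda^C$.

```python
import numpy as np

rng = np.random.default_rng(0)
lam_I, lam_C, t, n = 0.02, 0.03, 1.0, 500_000

# Cox construction with constant intensities: Lambda_X(u) = lam_X * u,
# so tau_X = Lambda_X^{-1}(xi_X) = xi_X / lam_X, xi_X ~ Exp(1) independent.
xi_I = rng.exponential(1.0, size=n)
xi_C = rng.exponential(1.0, size=n)
tau = np.minimum(xi_I / lam_I, xi_C / lam_C)     # first-to-default time

surv_mc = float((tau > t).mean())                # Monte Carlo survival probability
surv_th = float(np.exp(-(lam_I + lam_C) * t))    # survival with intensity lam_I + lam_C
print(surv_mc, surv_th)
```

With stochastic intensities one would replace $\lambda\_X u$ by a simulated path of $\Lambda\_X(u) = \int\_0^u \lambda\_s^X ds$ and invert it numerically; the conditional-independence property is built in through the independent draws $\xi\_I, \xi\_C$.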

*Remark 1* We suppose that the measure $\mathbb{Q}$ is the so-called *risk-neutral* measure, i.e. a measure under which the prices of the traded non-dividend-paying assets discounted at the risk-free rate are martingales or, in equivalent terms, the measure associated with the numeraire $B\_t$.

# *2.1 The Cash Flows*

To price this portfolio we take the conditional expectation of all the cash flows of the portfolio and discount them at the risk-free rate. An alternative to the explicit cash flows approach adopted here is discussed in [4].

To begin with, we consider a collateralized hedged contract, so the cash flows generated by the contract are:

• The payments due to the contract itself: modeled by an $\mathcal{F}$-predictable process $\pi\_t$ and a final cash flow $\Phi(S\_T)$ paid at maturity, modeled by a Lipschitz function $\Phi$. At time $t$ the cumulated discounted flows due to these components amount to

$$\mathbf{1}\_{\{\tau > T\}} D(t, T) \Phi(S\_T) + \int\_t^{\bar{\tau}} D(t, u) \pi\_u du.$$

• The payments due to default: in particular, we suppose that at time $\tau$ we have a cash flow due to the default event (if it happened), modeled by a $\mathcal{G}\_\tau$-measurable random variable $\theta\_\tau$. So the flows due to this component are

$$\mathbf{1}\_{\{t < \tau < T\}} D(t, \tau) \theta\_\tau = \mathbf{1}\_{\{t < \tau < T\}} \int\_t^T D(t, u) \theta\_u \, d\mathbf{1}\_{\{\tau \le u\}}.$$

• The payments due to the collateral account: more precisely, we model this account by an $\mathcal{F}$-predictable process $C\_t$. We postulate that $C\_t > 0$ if the investor is the collateral taker, and $C\_t < 0$ if the investor is the collateral provider. Moreover, we assume that the collateral taker remunerates the account at a certain interest rate (written in the CSA); in particular, we may have different rates depending on who the collateral taker is, so we introduce the rate

$$c\_t = \mathbf{1}\_{\{C\_t > 0\}} c\_t^+ + \mathbf{1}\_{\{C\_t \le 0\}} c\_t^- \,, \tag{1}$$

where $c\_t^+$, $c\_t^-$ are two $\mathcal{F}$-predictable processes. We also suppose that the collateral can be re-hypothecated, i.e. the collateral taker can use the collateral for funding purposes. Since the collateral taker has to remunerate the account at the rate $c\_t$, the discounted flows due to the collateral can be expressed as a cost of carry and sum up to

$$\int\_t^{\bar{\tau}} D(t, u)(r\_u - c\_u)C\_u du.$$

• We suppose that the deal we are considering is to be hedged by a position in cash and risky assets, represented respectively by the $\mathcal{G}$-adapted processes $F\_t$ and $H\_t$, with the convention that $F\_t > 0$ means that the investor is borrowing money (from the bank's treasury, for example), while $F\_t < 0$ means that $I$ is investing money. Also in this case, to take into account different rates in the borrowing and lending cases, we introduce the rate

$$f\_t = \mathbf{1}\_{\{V\_t - C\_t > 0\}} f\_t^+ + \mathbf{1}\_{\{V\_t - C\_t \le 0\}} f\_t^- \,. \tag{2}$$

The flows due to the funding part are

$$\int\_t^{\bar{\tau}} D(t, u)(r\_u - f\_u) F\_u du.$$

For the flows related to the risky assets account $H\_t$ we assume that we are hedging by means of repo contracts. Here $H\_t > 0$ means that we need some risky asset, so we borrow it, while if $H\_t < 0$ we lend it. So, for example, if we need to borrow the risky asset we need cash from the treasury; hence we borrow cash at a rate $f\_t$ and, as soon as we have the asset, we can repo-lend it at a rate $h\_t$. In general, $h\_t$ is defined as

$$h\_t = \mathbf{1}\_{\{H\_t > 0\}} h\_t^+ + \mathbf{1}\_{\{H\_t \le 0\}} h\_t^-. \tag{3}$$

Thus we have that the total discounted cash flows for the risky part of the hedge are equal to

$$\int\_t^{\bar{\tau}} D(t, u)(h\_u - f\_u)H\_u du.$$

The last expression could also be seen as resulting from $(r - f) - (r - h)$, in line with the previous definitions. If we add all the cash flows mentioned above, we obtain that the value of the contract $V\_t$ must satisfy

$$\begin{split} V\_t &= \mathbb{E}\_t^{\mathcal{G}} \left[ \int\_t^{\bar{\tau}} D(t, u) (\pi\_u + (r\_u - c\_u)C\_u + (r\_u - f\_u)F\_u - (f\_u - h\_u)H\_u) du \right] \\ &+ \mathbb{E}\_t^{\mathcal{G}} \left[ \mathbf{1}\_{\{\tau > T\}} D(t, T)\Phi(S\_T) + \mathbf{1}\_{\{t < \tau < T\}} D(t, \tau)\theta\_\tau \right]. \end{split} \tag{4}$$

If we further suppose that we are able to replicate the value of our contract using the funding, the collateral (assuming re-hypothecation, otherwise *C* is to be omitted from the following equation) and the risky asset accounts, i.e.

$$V\_u = F\_u + H\_u + C\_u,\tag{5}$$

we have, substituting for *Fu*:

$$\begin{split} V\_t &= \mathbb{E}\_t^{\mathcal{G}} \left[ \int\_t^{\bar{\tau}} D(t, u) (\pi\_u + (f\_u - c\_u)C\_u + (r\_u - f\_u)V\_u - (r\_u - h\_u)H\_u) du \right] \\ &+ \mathbb{E}\_t^{\mathcal{G}} \left[ \mathbf{1}\_{\{\tau > T\}} D(t, T)\Phi(S\_T) + \mathbf{1}\_{\{t < \tau < T\}} D(t, \tau)\theta\_\tau \right]. \end{split} \tag{6}$$
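Passing from (4) to (6) only uses the replication condition $V\_u = F\_u + H\_u + C\_u$ and a rearrangement of terms; this pointwise algebraic identity is easy to sanity-check numerically (the sample values below are arbitrary, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(1)
r, f, c, h = rng.normal(size=4)              # arbitrary instantaneous rates
H, C, V = rng.normal(size=3)                 # hedge, collateral, and deal value
F = V - H - C                                # replication condition (5)

lhs = (r - c) * C + (r - f) * F - (f - h) * H    # non-contractual terms in (4)
rhs = (f - c) * C + (r - f) * V - (r - h) * H    # non-contractual terms in (6)
print(bool(np.isclose(lhs, rhs)))                # True: the two integrands coincide
```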

*Remark 2* In the classic no-arbitrage theory and in a complete market setting, without credit risk, the hedging process *H* would correspond to a delta hedging strategy account. Here we do not enforce this interpretation yet. However, we will see that a delta-hedging interpretation emerges from the combined effect of working under the default-free filtration *F* (valuation under partial information) and of identifying part of the solution of the resulting BSDE, under reasonable regularity assumptions, as a sensitivity of the value to the underlying asset price *S*.

# *2.2 Adjusted Cash Flows Under a Simple Trading Model*

We now show how the adjusted cash flows originate, assuming we buy a call option on an equity asset $S$ with strike $K$. We analyze the operations a trader would enact with the treasury and the repo market in order to fund the trade, and we map these operations to the related cash flows. We go through the following steps in each small interval $[t, t + dt]$, seen from the point of view of the trader/investor buying the option. This is written in first person for clarity and is based on conversations with traders working with their bank treasuries.

Time *t*:


Time *t* + *dt*:


$$H\_t(1 + h\_t \, dt) - \Delta\_t S\_{t+dt} = -\Delta\_t \, dS\_t + h\_t H\_t \, dt$$

Notice that this $-\Delta\_t \, dS\_t$ is the right amount I needed to hedge $V$ in a classic delta-hedging setting.


$$V\_{t+dt} - V\_t(1 + f\_t \, dt) - C\_t(c\_t - f\_t) \, dt = dV\_t - f\_t V\_t \, dt - C\_t(c\_t - f\_t) \, dt$$

16. I now have that the total amount of flows is:

$$-\Delta\_t \, dS\_t + h\_t H\_t \, dt + dV\_t - f\_t V\_t \, dt - C\_t(c\_t - f\_t) \, dt$$

17. Now I present-value the above flows in *t* in a risk-neutral setting.

$$\begin{aligned} &\mathbb{E}\_t[-\Delta\_t \, dS\_t + h\_t H\_t \, dt + dV\_t - f\_t V\_t \, dt - C\_t(c\_t - f\_t) \, dt] \\ &= -\Delta\_t(r\_t - h\_t)S\_t \, dt + (r\_t - f\_t)V\_t \, dt - C\_t(c\_t - f\_t) \, dt - d\varphi(t) \\ &= -H\_t(r\_t - h\_t) \, dt + (r\_t - f\_t)(H\_t + F\_t + C\_t) \, dt - C\_t(c\_t - f\_t) \, dt - d\varphi(t) \\ &= (h\_t - f\_t)H\_t \, dt + (r\_t - f\_t)F\_t \, dt + (r\_t - c\_t)C\_t \, dt - d\varphi(t) \end{aligned}$$

This derivation holds assuming that $\mathbb{E}\_t[dS\_t] = r\_t S\_t \, dt$ and $\mathbb{E}\_t[dV\_t] = r\_t V\_t \, dt - d\varphi(t)$, where $d\varphi$ is a dividend of $V$ in $[t, t + dt)$ expressing the funding costs. Setting the above expression to zero, we obtain

$$d\varphi(t) = (h\_t - f\_t)H\_t \, dt + (r\_t - f\_t)F\_t \, dt + (r\_t - c\_t)C\_t \, dt$$

which coincides with the funding, collateral, and hedging cash flows appearing in the integrand of Eq. (4).

# **3 An FBSDE Under** *F*

We aim to switch to the default-free filtration $\mathcal{F} = (\mathcal{F}\_t)\_{t \ge 0}$, and the following lemma (taken from Bielecki and Rutkowski [1], Sect. 5.1) is key to understanding how the information expressed by $\mathcal{G}$ relates to the one expressed by $\mathcal{F}$.

**Lemma 1** *For any $\mathcal{A}$-measurable random variable $X$ and any $t \in \mathbb{R}\_+$, we have:*

$$\mathbb{E}\_t^{\mathcal{G}}\left[\mathbf{1}\_{\{t<\tau\le s\}}X\right] = \mathbf{1}\_{\{\tau>t\}}\frac{\mathbb{E}\_t^{\mathcal{F}}\left[\mathbf{1}\_{\{t<\tau\le s\}}X\right]}{\mathbb{E}\_t^{\mathcal{F}}\left[\mathbf{1}\_{\{\tau>t\}}\right]}.\tag{7}$$

*In particular, we have that for any $\mathcal{G}\_t$-measurable random variable $Y$ there exists an $\mathcal{F}\_t$-measurable random variable $Z$ such that*

$$\mathbf{1}\_{\{\tau>t\}}Y = \mathbf{1}\_{\{\tau>t\}}Z.$$

What follows is an application of the previous lemma exploiting the fact that we have to deal with a stochastic process structure and not only a simple random variable. Similar results are illustrated in [2].

**Lemma 2** *Suppose that $\phi\_u$ is a $\mathcal{G}$-adapted process. We consider a default time $\tau$ with intensity $\lambda\_u$. If we denote $\bar{\tau} = \tau \wedge T$, we have:*

$$\mathbb{E}\_t^{\mathcal{G}} \left[ \int\_t^{\bar{\tau}} \phi\_u du \right] = \mathbf{1}\_{\{\tau > t\}} \mathbb{E}\_t^{\mathcal{F}} \left[ \int\_t^T D(t, u, \lambda) \widetilde{\phi}\_u du \right]$$

*where $\widetilde{\phi}\_u$ is an $\mathcal{F}\_u$-measurable variable such that $\mathbf{1}\_{\{\tau>u\}}\widetilde{\phi}\_u = \mathbf{1}\_{\{\tau>u\}}\phi\_u$.*

*Proof*

$$\mathbb{E}\_t^{\mathcal{G}} \left[ \int\_t^{\bar{\tau}} \phi\_u du \right] = \mathbb{E}\_t^{\mathcal{G}} \left[ \int\_t^T \mathbf{1}\_{\{\tau > t\}} \mathbf{1}\_{\{\tau > u\}} \phi\_u du \right] = \int\_t^T \mathbb{E}\_t^{\mathcal{G}} \left[ \mathbf{1}\_{\{\tau > t\}} \mathbf{1}\_{\{\tau > u\}} \phi\_u \right] du$$

then by using Lemma 1 we have

$$= \int\_t^T \mathbf{1}\_{\{\tau>t\}} \frac{\mathbb{E}\_t^{\mathcal{F}} \left[\mathbf{1}\_{\{\tau>t\}}\mathbf{1}\_{\{\tau>u\}} \phi\_u\right]}{\mathbb{Q}[\tau>t \mid \mathcal{F}\_t]} du = \mathbf{1}\_{\{\tau>t\}} \int\_t^T \mathbb{E}\_t^{\mathcal{F}} \left[\mathbf{1}\_{\{\tau>u\}} \phi\_u\right] D(0,t,\lambda)^{-1} du$$

now we choose an $\mathcal{F}\_u$-measurable variable $\widetilde{\phi}\_u$ such that $\mathbf{1}\_{\{\tau>u\}}\widetilde{\phi}\_u = \mathbf{1}\_{\{\tau>u\}}\phi\_u$ and obtain

$$\begin{aligned} &= \mathbf{1}\_{\{\tau>t\}} \int\_t^T \mathbb{E}\_t^{\mathcal{F}} \left[ \mathbb{E}\_u^{\mathcal{F}} \left[ \mathbf{1}\_{\{\tau>u\}} \right] \widetilde{\phi}\_u \right] D(0, t, \lambda)^{-1} du \\ &= \mathbf{1}\_{\{\tau>t\}} \int\_t^T \mathbb{E}\_t^{\mathcal{F}} \left[ D(0, u, \lambda) \widetilde{\phi}\_u \right] D(0, t, \lambda)^{-1} du = \mathbf{1}\_{\{\tau>t\}} \mathbb{E}\_t^{\mathcal{F}} \left[ \int\_t^T D(t, u, \lambda) \widetilde{\phi}\_u du \right] \end{aligned}$$

where the penultimate equality comes from the fact that the default times are conditionally independent: if we define $\Lambda\_X(u) = \int\_0^u \lambda\_s^X ds$ with $X \in \{I, C\}$, we have that $\tau\_X = \Lambda\_X^{-1}(\xi\_X)$ with $\xi\_X$ mutually independent exponential random variables independent of $\lambda^X$.<sup>1</sup> A similar result will enable us to deal with the default cash flow term. In fact, we have the following (Lemma 3.8.1 in [2])

**Lemma 3** *Suppose that $\phi\_u$ is an $\mathcal{F}$-predictable process. We consider two conditionally independent default times $\tau\_I, \tau\_C$ generated by Cox processes with $\mathcal{F}$-intensity rates $\lambda\_t^I, \lambda\_t^C$. If we denote $\tau = \tau\_C \wedge \tau\_I$, we have:*

$$\mathbb{E}\_t^{\mathcal{G}} \left[ \mathbf{1}\_{\{t < \tau < T\}} \mathbf{1}\_{\{\tau\_I < \tau\_C\}} \phi\_\tau \right] = \mathbf{1}\_{\{\tau > t\}} \mathbb{E}\_t^{\mathcal{F}} \left[ \int\_t^T D(t, u, \lambda^I + \lambda^C) \lambda\_u^I \phi\_u du \right].$$

Now we postulate a particular form for the default cash flow; more precisely, if we indicate by $\widetilde{V}\_t$ the $\mathcal{F}$-adapted process such that

$$1\_{\{\tau>t\}}\widetilde{V}\_t = 1\_{\{\tau>t\}}V\_t$$

then we define

$$\theta\_t = \epsilon\_t - \mathbf{1}\_{\{\tau\_C \le \tau\_I\}} LGD\_C(\epsilon\_t - C\_t)^+ + \mathbf{1}\_{\{\tau\_I \le \tau\_C\}} LGD\_I(\epsilon\_t - C\_t)^-.$$

where $LGD$ indicates the loss given default, typically defined as $1 - REC$, with $REC$ the corresponding recovery rate, $(x)^+$ indicates the positive part of $x$, and $(x)^- = -(-x)^+$. The meaning of these flows is the following; consider $\theta\_\tau$:


A similar reasoning applies to the case when the Investor defaults.
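To fix ideas, the close-out cash flow $\theta\_\tau$ can be coded as a small function; this is our own sketch (the function and argument names are ours, not the paper's), following the sign convention $(x)^- = -(-x)^+$.

```python
def theta(eps, coll, lgd_C, lgd_I, counterparty_first):
    """Default cash flow theta from the investor's viewpoint.

    eps: close-out value epsilon, coll: collateral C,
    lgd = 1 - REC, counterparty_first: True if tau_C <= tau_I.
    """
    pos = max(eps - coll, 0.0)        # (eps - C)^+ : uncollateralized exposure
    neg = min(eps - coll, 0.0)        # (eps - C)^- = -(-(eps - C))^+
    if counterparty_first:
        return eps - lgd_C * pos      # lose LGD_C of the positive exposure
    return eps + lgd_I * neg          # symmetric term when the investor defaults

print(theta(100.0, 80.0, 0.6, 0.6, True))    # 100 - 0.6 * 20 = 88.0
print(theta(50.0, 80.0, 0.6, 0.4, False))    # 50 + 0.4 * (-30) = 38.0
```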

If we now change filtration, we obtain the following expression for *Vt* (where we omitted the tilde sign over the rates, see Remark 3):

<sup>1</sup>See for example Sect. 8.2.1 and Lemma 9.1.1 of [1].


$$\begin{split} V\_t &= \mathbf{1}\_{\{\tau > t\}} \mathbb{E}\_t^{\mathcal{F}} \left[ \int\_t^T D(t, u, r + \lambda)((f\_u - c\_u)C\_u + (r\_u - f\_u)\widetilde{V}\_u - (r\_u - h\_u)\widetilde{H}\_u)du \right] \\ &+ \mathbf{1}\_{\{\tau > t\}} \mathbb{E}\_t^{\mathcal{F}} \left[ D(t, T, r + \lambda)\Phi(S\_T) + \int\_t^T D(t, u, r + \lambda)\pi\_u du \right] \\ &+ \mathbf{1}\_{\{\tau > t\}} \mathbb{E}\_t^{\mathcal{F}} \left[ \int\_t^T D(t, u, r + \lambda)\widetilde{\theta}\_u du \right], \end{split} \tag{8}$$

where, if we suppose $\epsilon\_t$ to be $\mathcal{F}$-predictable, we have (using Lemma 3):

$$
\tilde{\theta}\_u = \epsilon\_u \lambda\_u - LGD\_C(\epsilon\_u - C\_u)^+ \lambda\_u^C + LGD\_I(\epsilon\_u - C\_u)^- \lambda\_u^I. \tag{9}
$$

*Remark 3* From now on we will omit the tilde sign over the rates *fu*, *hu*. Moreover, we note that if a rate is of the form

$$x\_t = x^+ \mathbf{1}\_{\{g(V\_t, H\_t, C\_t) > 0\}} + x^- \mathbf{1}\_{\{g(V\_t, H\_t, C\_t) \le 0\}}$$

then on the set {τ > *t*} it coincides with the rate

$$\widetilde{x}\_t = x^+ \mathbf{1}\_{\{g(\widetilde{V}\_t, \widetilde{H}\_t, C\_t) > 0\}} + x^- \mathbf{1}\_{\{g(\widetilde{V}\_t, \widetilde{H}\_t, C\_t) \le 0\}}$$

because $\mathbf{1}\_{\{\tau>t\}} x^+ \mathbf{1}\_{\{g(V\_t, H\_t, C\_t)>0\}} = x^+ \mathbf{1}\_{\{\tau>t\}} \mathbf{1}\_{\{g(V\_t, H\_t, C\_t)>0\}}$, and on $\{\tau > t\}$ we have $V\_t = \widetilde{V}\_t$ and $H\_t = \widetilde{H}\_t$, and hence $g(V\_t, H\_t, C\_t) > 0 \iff g(\widetilde{V}\_t, \widetilde{H}\_t, C\_t) > 0$.

We note that this expression is of the form $V\_t = \mathbf{1}\_{\{\tau>t\}}\Upsilon$, meaning that $V\_t$ is zero on $\{\tau \le t\}$ and that on the set $\{\tau > t\}$ it coincides with the $\mathcal{F}$-measurable random variable $\Upsilon$. But we already know a variable that coincides with $V\_t$ on $\{\tau > t\}$, namely $\widetilde{V}\_t$. Hence we can write the following:

$$\begin{split} \widetilde{V}\_t &= \mathbb{E}\_t^{\mathcal{F}} \left[ \int\_t^T D(t, u, r + \lambda)(\pi\_u + (f\_u - c\_u)C\_u + (r\_u - f\_u)\widetilde{V}\_u - (r\_u - h\_u)\widetilde{H}\_u)du \right] \\ &+ \mathbb{E}\_t^{\mathcal{F}} \left[ D(t, T, r + \lambda)\Phi(S\_T) + \int\_t^T D(t, u, r + \lambda)\widetilde{\theta}\_u du \right]. \end{split} \tag{10}$$

We now show a way to obtain a BSDE from Eq. (10); another possible approach (without default risk) is shown, for example, in [9]. We introduce the process

$$\begin{split} X\_t &= \int\_0^t D(0, u, r + \lambda) \pi\_u du + \int\_0^t D(0, u, r + \lambda) \widetilde{\theta}\_u du \\ &+ \int\_0^t D(0, u, r + \lambda) \left[ (f\_u - c\_u) C\_u + (r\_u - f\_u) \widetilde{V}\_u - (r\_u - h\_u) \widetilde{H}\_u \right] du. \end{split} \tag{11}$$

Now we can construct a martingale summing up *Xt* and the discounted value of the deal as in the following:

$$D(0, t, r + \lambda)\widetilde{V}\_t + X\_t = \mathbb{E}\_t^{\mathcal{F}}[X\_T + D(0, T, r + \lambda)\Phi(S\_T)].$$

So differentiating both sides we obtain:

$$\begin{aligned} &-(r\_u + \lambda\_u) D(0, u, r + \lambda)\widetilde{V}\_u du + D(0, u, r + \lambda)\, d\widetilde{V}\_u + dX\_u \\ &= d\mathbb{E}\_u^{\mathcal{F}}[X\_T + D(0, T, r + \lambda)\Phi(S\_T)]. \end{aligned}$$

If we substitute for *Xt* we have that the expression:

$$d\tilde{V}\_u + \left[\pi\_u - (r\_u + \lambda\_u)\tilde{V}\_u + \tilde{\theta}\_u + (f\_u - c\_u)C\_u + (r\_u - f\_u)\tilde{V}\_u - (r\_u - h\_u)\tilde{H}\_u\right]du$$

is equal to:

$$\frac{d\mathbb{E}\_u^{\mathcal{F}}[X\_T + D(0, T, r + \lambda)\Phi(S\_T)]}{D(0, u, r + \lambda)}.$$

The process $(\mathbb{E}\_t^{\mathcal{F}}[X\_T + D(0, T, r + \lambda)\Phi(S\_T)])\_{t \ge 0}$ is clearly a closed $\mathcal{F}$-martingale, and hence

$$\int\_{0}^{t} D(0, u, r + \lambda)^{-1} d\mathbb{E}\_{u}^{\mathcal{F}}[X\_{T} + D(0, T, r + \lambda)\Phi(S\_{T})]$$

is a local *F*-martingale. Then, being

$$\int\_0^t D(0, u, r + \lambda)^{-1} d\mathbb{E}\_u^{\mathcal{F}}[X\_T + D(0, T, r + \lambda)\Phi(S\_T)]$$

adapted to the Brownian-driven filtration *F*, by the martingale representation theorem we have

$$\int\_0^t D(0, u, r + \lambda)^{-1} d\mathbb{E}\_u^{\mathcal{F}}[X\_T + D(0, T, r + \lambda)\Phi(\mathcal{S}\_T)] = \int\_0^t Z\_u dW\_u$$

for some *F*-predictable process *Zu*. Hence we can write:

$$d\tilde{V}\_u + \left[\pi\_u - (f\_u + \lambda\_u)\tilde{V}\_u + \tilde{\theta}\_u + (f\_u - c\_u)C\_u - (r\_u - h\_u)\tilde{H}\_u\right]du = Z\_u dW\_u. \quad (12)$$

# **4 Markovian FBSDE and PDE for $\widetilde{V}\_t$ and the Invariance Theorem**

As it stands, Eq. (12) is far too general, so we make some simplifying assumptions in order to guarantee existence and uniqueness of a solution. First we assume a Markovian setting, and hence we suppose that all the processes appearing in (12) are deterministic functions of $S\_u$, $\widetilde{V}\_u$ or $Z\_u$ and time. More precisely, we assume that:


Under our assumptions, Eq. (12) becomes the following FBSDE:

$$\begin{aligned} dS\_t &= r\_t S\_t dt + \sigma(t, S\_t)dW\_t \\ S\_0 &= s \\ d\widetilde{V}\_t &= -\underbrace{\left[\pi\_t + \widetilde{\theta}\_t - \lambda\_t\widetilde{V}\_t + f\_t\widetilde{V}\_t(\alpha\_t - 1) - c\_t(\alpha\_t\widetilde{V}\_t) - (r\_t - h\_t)S\_t\frac{Z\_t}{\sigma(t, S\_t)}\right]}\_{B(t, S\_t, \widetilde{V}\_t, Z\_t)}dt + Z\_t dW\_t \\ \widetilde{V}\_T &= \Phi(S\_T) \end{aligned} \tag{13}$$

We want to obtain existence and uniqueness of the solution to the above-mentioned FBSDE and a related PDE. A possible choice is the following (see J. Zhang [15] Theorem 2.4.1 on page 41):

<sup>2</sup>At this stage the assumption we made on $V$ is not properly justified; see Theorem 3 and Remark 4 for details.

**Theorem 1** *Consider the following FBSDE on* [0, *T*]*:*

$$\begin{aligned} dX\_t^{q,x} &= \mu(t, X\_t^{q,x})dt + \sigma(t, X\_t^{q,x})dW\_t \quad q < t \le T \\ X\_t &= x \quad 0 \le t \le q \\ dY\_t^{q,x} &= -f(t, X\_t^{q,x}, Y\_t^{q,x}, Z\_t^{q,x})dt + Z\_t^{q,x}dW\_t \\ Y\_T^{q,x} &= g(X\_T^{q,x}) \end{aligned} \tag{14}$$

*If we assume that there exists a positive constant K such that*

- $|f(t, x, y, z) - f(t, x', y', z')| \le K(|x - x'| + |y - y'| + |z - z'|)$
- $|f(t, 0, 0, 0)| + |g(0)| \le K$

*and moreover the functions $\mu(t, x)$ and $\sigma(t, x)$ are $C^2$ with bounded derivatives, then Eq. (14) has a unique solution $(X\_t^{q,x}, Y\_t^{q,x}, Z\_t^{q,x})$ and $u(t, x) = Y\_t^{t,x}$ is the unique* classical *(i.e. $C^{1,2}$) solution to the following semilinear PDE*

$$\begin{aligned} &\partial\_t u(t, x) + \frac{1}{2}\sigma(t, x)^2 \partial\_x^2 u(t, x) + \mu(t, x)\partial\_x u(t, x) + f(t, x, u(t, x), \sigma(t, x)\partial\_x u(t, x)) = 0 \\ &u(T, x) = g(x) \end{aligned} \tag{15}$$

We cannot directly apply Theorem 1 to our FBSDE because $B(t, s, v, z)$ is not Lipschitz continuous in $s$, due to the hedging term. But, since the hedging term is linear in $Z\_t$, we can move it from the drift of the backward equation to the drift of the forward one. More precisely, consider the following:

$$\begin{split} dS\_t^{q,s} &= h\_t S\_t^{q,s}dt + \sigma(t, S\_t^{q,s})dW\_t \quad q < t \le T \\ S\_t^{q,s} &= s \quad 0 \le t \le q \\ dV\_t^{q,s} &= -\underbrace{\left[\pi\_t + \widetilde{\theta}\_t - \lambda\_t V\_t^{q,s} + f\_t V\_t^{q,s}(\alpha\_t - 1) - c\_t(\alpha\_t V\_t^{q,s})\right]}\_{B'(t, S\_t^{q,s}, V\_t^{q,s})}dt + Z\_t^{q,s}dW\_t \\ V\_T^{q,s} &= \Phi(S\_T^{q,s}). \end{split} \tag{16}$$

Indeed, one can check that the assumptions of Theorem 1 are satisfied for this equation:

**Theorem 2** *If the rates $\lambda\_t$, $f\_t$, $c\_t$, $h\_t$, $r\_t$ are bounded, then $|B'(t, s, v) - B'(t, s', v')| \le K(|s - s'| + |v - v'|)$ and $|B'(t, 0, 0)| + \Phi(0) \le K$. Hence, if $\sigma(t, s)$ is a positive $C^2$ function with bounded derivatives, then the assumptions of Theorem 1 are satisfied, so Eq. (16) has a unique solution; moreover $V\_t^{t,s} = u(t, s) \in C^{1,2}$ and satisfies the following semilinear PDE:*

$$\begin{aligned} &\partial\_t u(t, s) + \frac{1}{2}\sigma(t, s)^2 \partial\_s^2 u(t, s) + h\_t s\, \partial\_s u(t, s) + B'(t, s, u(t, s)) = 0 \\ &u(T, s) = \Phi(s) \end{aligned} \tag{17}$$

*Proof* We start by rewriting the term

$$B'(t, s, v) = \pi\_t(s) + \widetilde{\theta}\_t(v) + (f\_t(\alpha\_t - 1) - \lambda\_t - c\_t\alpha\_t)v.$$

Since the sum of two Lipschitz functions is itself a Lipschitz function, we can restrict ourselves to analyzing the summands that appear in the previous formula. The term $\pi\_t$ is Lipschitz continuous in $s$ by assumption. The $\widetilde{\theta}$ term and the $(f\_t(\alpha\_t - 1) - \lambda\_t - c\_t\alpha\_t)v$ term are continuous and piecewise linear, hence Lipschitz continuous, and this concludes the proof.

Note that the *S*-dynamics in (16) has the repo rate *h* as drift. Since in general *h* will depend on the future values of the deal, this is a source of nonlinearity and is at times represented informally with an expected value E*<sup>h</sup>* or a pricing measure Q*<sup>h</sup>*, see for example [5] and the related discussion on operational implications for the case *h* = *f* .
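As a minimal numerical illustration of PDE (17) (our own sketch, not from the paper), take constant rates, $\pi = 0$, $\lambda = 0$ and $\alpha = 0$, so that $B'(t, s, u) = -f u$ and the PDE reduces to a Black–Scholes-type equation with repo drift $h$ and funding discount $f$. This special case is linear precisely so that an explicit finite-difference scheme can be checked against a closed form; all parameter values are arbitrary assumptions.

```python
import math
import numpy as np

sig, h, f, K, T = 0.2, 0.03, 0.05, 1.0, 1.0   # vol, repo, funding, strike, maturity
M, N, s_max = 200, 4000, 4.0                   # space steps, time steps, domain size

s = np.linspace(0.0, s_max, M + 1)
ds, dt = s[1] - s[0], T / N
u = np.maximum(s - K, 0.0)                     # terminal condition u(T, s) = Phi(s)

for n in range(1, N + 1):                      # explicit scheme, backward in time
    tau = n * dt                               # time to maturity
    d2 = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / ds**2
    d1 = (u[2:] - u[:-2]) / (2.0 * ds)
    u[1:-1] += dt * (0.5 * sig**2 * s[1:-1]**2 * d2
                     + h * s[1:-1] * d1 - f * u[1:-1])
    u[0] = 0.0                                 # boundary at s = 0
    u[-1] = math.exp(-f * tau) * (s_max * math.exp(h * tau) - K)

# Closed form for this linear special case: drift h, discounting at f
Phi_N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
fwd = 1.0 * math.exp(h * T)                    # forward of S_0 = 1 under repo drift
d_1 = (math.log(fwd / K) + 0.5 * sig**2 * T) / (sig * math.sqrt(T))
cf = math.exp(-f * T) * (fwd * Phi_N(d_1) - K * Phi_N(d_1 - sig * math.sqrt(T)))
print(u[50], cf)                               # finite-difference vs closed form at S_0 = 1
```

In the genuinely nonlinear case ($\alpha \ne 0$, rate switches depending on the sign of the exposure) the same backward marching applies, but $B'$ must be re-evaluated at each step from the current $u$, which is exactly where the nonlinearity discussed above enters.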

We now show that a solution to Eq. (13) can be obtained by means of the classical solution to the PDE (17). We start by considering the following forward equation, which is known to have a unique solution under our assumptions on $\sigma(t,s)$:

$$dS_t = r_t S_t\,dt + \sigma(t, S_t)\,dW_t, \quad S_0 = s. \tag{18}$$

We define $V_t = u(t, S_t)$ and $Z_t = \sigma(t, S_t)\,\partial_s u(t, S_t)$. By Theorem 2 we know that $u(t,s) \in C^{1,2}$, and by applying Itô's formula and (17) we obtain:

$$\begin{aligned} dV_t &= du(t, S_t)\\ &= \left(\partial_t u(t, S_t) + r_t S_t\,\partial_s u(t, S_t) + \frac{1}{2}\sigma(t, S_t)^2\,\partial_s^2 u(t, S_t)\right)dt + \sigma(t, S_t)\,\partial_s u(t, S_t)\,dW_t\\ &= \left((r_t - h_t) S_t\,\partial_s u(t, S_t) - B'(t, S_t, u(t, S_t))\right)dt + \sigma(t, S_t)\,\partial_s u(t, S_t)\,dW_t\\ &= \left((r_t - h_t) S_t \frac{Z_t}{\sigma(t, S_t)} - \pi_t(S_t) - \theta_t(V_t) - \left(f_t(\alpha_t - 1) - \lambda_t - c_t\alpha_t\right)V_t\right)dt + Z_t\,dW_t. \end{aligned}$$

We have thus established the following:

**Theorem 3** (Solution to the Valuation Equation) *Let* $S_t$ *be the solution to Eq. (18) and* $u(t,s)$ *the classical solution to Eq. (17). Then the process* $(S_t,\, u(t, S_t),\, \sigma(t, S_t)\,\partial_s u(t, S_t))$ *is the unique solution to Eq. (13).*

*Proof* From the reasoning above, $(S_t,\, u(t, S_t),\, \sigma(t, S_t)\,\partial_s u(t, S_t))$ solves Eq. (13). Finally, from the seminal result of [14] we know that if there exist $K > 0$ and $p \ge 1/2$ such that:

• $|\mu(t, x) - \mu(t, x')| + |\sigma(t, x) - \sigma(t, x')| \le K|x - x'|$


then the FBSDE (14) has a unique solution. Since we have to check Lipschitz continuity only in $y$ and $z$, we can verify that Eq. (13) satisfies the above-mentioned assumptions and hence has a unique solution.

*Remark 4* Since we proved that $V_t = u(t, S_t)$ with $u(t,s) \in C^{1,2}$, the reasoning we used when saying that $H_t = S_t \frac{Z_t}{\sigma(t, S_t)}$ represents choosing a delta-hedge is actually more than a heuristic argument.

Moreover, since (17) does not depend on the risk-free rate $r_t$, we can state the following:

**Theorem 4** (Invariance Theorem) *Under the assumptions at the beginning of Sect. 4, and assuming that we are backing our deal with a delta-hedging strategy, the price* $V_t$ *can be calculated via the semilinear PDE (17) and does* not *depend on the risk-free rate* $r_t$*.*

This invariance result shows that even when starting from a risk-neutral valuation theory, the risk-free rate disappears from the nonlinear valuation equations. A discussion on consequences of nonlinearity and invariance on valuation in general, on the operational procedures of a bank, on the legitimacy of fully charging the nonlinear value to a client, and on the related dangers of overlapping valuation adjustments is presented elsewhere, see for example [3, 5] and references therein.
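To make the invariance concrete, the semilinear PDE (17) can be discretized with a simple explicit backward finite-difference scheme. The sketch below is not the authors' implementation: the coefficients `sigma`, `h` and the reaction term `B` passed to it are hypothetical placeholders, and the sanity check uses the linear special case $B' = 0$, $\Phi(s) = s$, whose exact solution $u(t,s) = s\,e^{h(T-t)}$ involves the repo rate $h$ but not the risk-free rate $r$.

```python
import numpy as np

def solve_semilinear_pde(phi, sigma, h, B, T=1.0, s_max=200.0, n_s=200, n_t=2000):
    """Explicit backward finite-difference scheme for the semilinear PDE (17):
    d_t u + 0.5*sigma(t,s)^2 d_ss u + h(t)*s*d_s u + B(t,s,u) = 0,  u(T,s) = phi(s).
    Toy sketch: uniform grid, one-sided first derivatives at the boundaries."""
    ds, dt = s_max / n_s, T / n_t
    s = np.linspace(0.0, s_max, n_s + 1)
    u = np.array(phi(s), dtype=float)          # terminal condition u(T, .)
    for n in range(n_t, 0, -1):
        t = (n - 1) * dt                       # step backward from n*dt to (n-1)*dt
        u_s, u_ss = np.empty_like(u), np.zeros_like(u)
        u_s[1:-1] = (u[2:] - u[:-2]) / (2 * ds)
        u_s[0], u_s[-1] = (u[1] - u[0]) / ds, (u[-1] - u[-2]) / ds
        u_ss[1:-1] = (u[2:] - 2 * u[1:-1] + u[:-2]) / ds ** 2
        u = u + dt * (0.5 * sigma(t, s) ** 2 * u_ss + h(t) * s * u_s + B(t, s, u))
    return s, u

# Sanity check in the linear case B' = 0 with Phi(s) = s: u(0,s) = s*exp(h*T),
# independent of r, in line with the Invariance Theorem.
s, u = solve_semilinear_pde(phi=lambda s: s,
                            sigma=lambda t, s: 0.2 * np.maximum(s, 1e-8),
                            h=lambda t: 0.02,
                            B=lambda t, s, u: np.zeros_like(s))
u0_at_100 = u[np.searchsorted(s, 100.0)]       # expect about 100*exp(0.02)
```

The time step is chosen to satisfy the explicit-scheme stability bound $\tfrac{1}{2}\sigma^2\,\Delta t/\Delta s^2 \le \tfrac{1}{2}$ at the largest grid value of $\sigma$.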

**Acknowledgements** The opinions expressed here are solely those of the authors and do not represent in any way those of their employers. We are grateful to Cristin Buescu, Jean-François Chassagneux, François Delarue, and Marek Rutkowski for helpful discussions and suggestions that helped us improve the paper. Marek Rutkowski's and Andrea Pallavicini's visits were funded via the EPSRC Mathematics Platform grant EP/I019111/1.

The KPMG Center of Excellence in Risk Management is acknowledged for organizing the conference "Challenges in Derivatives Markets - Fixed Income Modeling, Valuation Adjustments, Risk Management, and Regulation".

**Open Access** This chapter is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, a link is provided to the Creative Commons license and any changes made are indicated.

The images or other third party material in this chapter are included in the work's Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work's Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.

# **References**


# **Nonlinear Monte Carlo Schemes for Counterparty Risk on Credit Derivatives**

**Stéphane Crépey and Tuyet Mai Nguyen**

**Abstract** Two nonlinear Monte Carlo schemes, namely, the linear Monte Carlo expansion with randomization of Fujii and Takahashi (Int J Theor Appl Financ 15(5):1250034(24), 2012 [9]; Q J Financ 2(3):1250015(24), 2012 [10]) and the marked branching diffusion scheme of Henry-Labordère (Risk Mag 25(7):67–73, 2012 [13]), are compared in terms of applicability and numerical behavior regarding counterparty risk computations on credit derivatives. This is done in two dynamic copula models of portfolio credit risk: the dynamic Gaussian copula model and the model in which default dependence stems from joint defaults. For such high-dimensional and nonlinear pricing problems, more standard deterministic or simulation/regression schemes are ruled out by Bellman's "curse of dimensionality", and only purely forward Monte Carlo schemes can be used.

**Keywords** Counterparty risk · Funding · BSDE · Gaussian copula · Marshall–Olkin copula · Particles

# **1 Introduction**

Counterparty risk has been a major issue since the global credit crisis and the ongoing European sovereign debt crisis. In a bilateral counterparty risk setup, counterparty risk is valued as the so-called credit valuation adjustment (CVA), for the risk of default of the counterparty, and the debt valuation adjustment (DVA), for own default risk. In such a setup, the classical assumption of a locally risk-free funding asset used for both investing and unsecured borrowing is no longer sustainable. The proper accounting of the funding costs of a position leads to the funding valuation adjustment (FVA). Moreover, these adjustments are interdependent and must be computed jointly

S. Crépey (B) · T.M. Nguyen

Laboratoire de Mathématiques et Modélisation, Université d'Évry Val d'Essonne, 91037 Évry Cedex, France e-mail: stephane.crepey@univ-evry.fr

T.M. Nguyen
e-mail: tuyetmai.nguyen@univ-evry.fr

through a global correction dubbed total valuation adjustment (TVA). The pricing equation for the TVA is nonlinear due to the funding costs. It is posed over a random time interval determined by the first default time of the two counterparties. To deal with the corresponding backward stochastic differential equation (BSDE), a first reduced-form modeling approach was proposed in Crépey [3], under a rather standard immersion hypothesis between a reference (or market) filtration and the full model filtration progressively enlarged by the default times of the counterparties. This basic immersion setup is fine for standard applications, such as counterparty risk on interest rate derivatives. But it is too restrictive for situations of strong dependence between the underlying exposure and the default risk of the two counterparties, such as counterparty risk on credit derivatives, which involves strong adverse dependence, called wrong-way risk (for some insights into related financial contexts, see Fujii and Takahashi [11], Brigo et al. [2]). For this reason, an extended reduced-form modeling approach has recently been developed in Crépey and Song [4–6]. With credit derivatives, the problem is also very high-dimensional. From a numerical point of view, for high-dimensional nonlinear problems, only purely forward simulation schemes can be used. In Crépey and Song [6], the problem is addressed by the linear Monte Carlo expansion with randomization of Fujii and Takahashi [9, 10].

In the present work, we assess another scheme, namely the marked branching diffusion approach of Henry-Labordère [13], which we compare with the previous one in terms of applicability and numerical behavior. This is done in two dynamic copula models of portfolio credit risk: the dynamic Gaussian copula model and the dynamic Marshall–Olkin model in which default dependence stems from joint defaults.

The paper is organized as follows. Sections 2 and 3 provide a summary of the main pricing and TVA BSDEs derived in Crépey and Song [4–6]. Section 4 presents two nonlinear Monte Carlo schemes that can be considered for solving these equations in high-dimensional models, such as the portfolio credit models of Sect. 5. Comparative numerics in these models are presented in Sect. 6. Section 7 concludes.

# **2 Prices**

# *2.1 Setup*

We consider a netted portfolio of OTC derivatives between two defaultable counterparties, generally referred to as the contract between a bank, the perspective of which is taken, and its counterparty. After having bought the contract from its counterparty at time 0, the bank sets up a hedging, collateralization (or margining), and funding portfolio. We call the funder of the bank a third party, possibly composed in practice of several entities or devices, ensuring funding of the bank's strategy. The funder, assumed default-free for simplicity, plays the role of lender/borrower of last resort after the exhaustion of the internal sources of funding provided to the bank through its hedge and collateral.

For notational simplicity we assume no collateralization. All the numerical considerations, our main focus in this work, can be readily extended to the case of collateralized portfolios using the corresponding developments in Crépey and Song [6]. Likewise, we assume hedging in the simplest sense of replication by the bank and we consider the case of a fully securely funded hedge, so that the cost of the hedge of the bank is exactly reflected by the wealth of its hedging and funding portfolio.

We consider a stochastic basis $(\Omega, \mathcal{G}_T, \mathcal{G}, \mathbb{Q})$, where $\mathcal{G} = (\mathcal{G}_t)_{t\in[0,T]}$ is interpreted as a risk-neutral pricing model on the primary market of the instruments that are used by the bank for hedging its TVA. The reference filtration $\mathcal{F}$ is a subfiltration of $\mathcal{G}$ representing the counterparty risk-free filtration, not carrying any direct information about the defaults of the two counterparties. The relation between these two filtrations will be pointed out in the condition (C) introduced later. We denote by:


# *2.2 Clean Price*

We denote by $P$ the reference (or clean) price of the contract, ignoring counterparty risk and assuming the position of the bank financed at the risk-free rate $r$, i.e. the $\mathcal{G}$ conditional expectation of the future contractual cash flows discounted at the risk-free rate $r$. In particular,

$$
\beta_t P_t = \mathbb{E}_t \left[ \int_t^{\bar{\tau}} \beta_s\, dD_s + \beta_{\bar{\tau}} P_{\bar{\tau}} \right], \quad \forall t \in [0, \bar{\tau}]. \tag{1}
$$

We also define $Q_t = P_t + \mathbb{1}_{\{t = \tau < T\}}\Delta_\tau$, so that $Q_\tau$ represents the clean value of the contract inclusive of the promised dividend at default (if any) $\Delta_\tau$, which also belongs to the "debt" of the counterparty to the bank (or vice versa, depending on the sign of $Q_\tau$) in case of default of a party. Accordingly, at time $\tau$ (if $< T$), the close-out cash flow of the counterparty to the bank is modeled as

$$\mathcal{R} = \mathbb{1}_{\{\tau = \tau_c\}} \left( R_c Q_\tau^+ - Q_\tau^- \right) - \mathbb{1}_{\{\tau = \tau_b\}} \left( R_b Q_\tau^- - Q_\tau^+ \right) - \mathbb{1}_{\{\tau = \tau_b = \tau_c\}}\, Q_\tau, \tag{2}$$

where *Rb* and *Rc* are the recovery rates of the bank and of the counterparty to each other.

# *2.3 All-Inclusive Price*

Let $\Pi$ be the all-inclusive price of the contract for the bank, including the cost of counterparty risk and funding costs. Since we assume a securely funded hedge (in the sense of replication) and no collateralization, the amounts invested and funded by the bank at time $t$ are respectively given by $\Pi_t^-$ and $\Pi_t^+$. The all-inclusive price $\Pi$ is the discounted conditional expectation of all effective future cash flows, including the contractual dividends before $\tau$, the cost of funding the position prior to $\tau$, and the terminal cash flow at $\tau$. Hence,

$$\beta_t \Pi_t = \mathbb{E}_t \left[ \int_t^{\bar{\tau}} \beta_s \mathbb{1}_{s < \tau}\, dD_s - \int_t^{\bar{\tau}} \beta_s \bar{\lambda}_s \Pi_s^+\, ds + \beta_{\bar{\tau}} \mathbb{1}_{\tau < T}\, \mathcal{R} \right], \tag{3}$$

where $\bar{\lambda}$ is the funding spread over $r$ of the bank toward the external funder, i.e. the bank borrows cash from its funder at rate $r + \bar{\lambda}$ (and invests cash at the risk-free rate $r$). Since the right-hand side in (3) also depends on $\Pi$, (3) is in fact a backward stochastic differential equation (BSDE). Consistent with the no-arbitrage principle, the gain process on the hedge is a $\mathbb{Q}$ martingale, which explains why it does not appear in (3).

# **3 TVA BSDEs**

The total valuation adjustment (TVA) process Θ is defined as

$$
\Theta = \mathcal{Q} - \Pi. \tag{4}
$$

In this section we review the main TVA BSDEs that are derived in Crépey and Song [4–6]. Three BSDEs are presented. These three equations are essentially equivalent mathematically. However, depending on the underlying model, they are not always amenable to the same numerical schemes or the numerical performance of a given scheme may differ between them.

# *3.1 Full TVA BSDE*

By taking the difference between (1) and (3), we obtain

$$
\beta_t \Theta_t = \mathbb{E}_t \left[ \int_t^{\bar{\tau}} \beta_s\, fva_s(\Theta_s)\, ds + \beta_{\bar{\tau}} \mathbb{1}_{\tau < T}\, \xi \right], \quad \forall t \in [0, \bar{\tau}], \tag{5}
$$

where $fva_t(\vartheta) = \bar{\lambda}_t(P_t - \vartheta)^+$ is the funding coefficient and where

$$\xi = Q_\tau - \mathcal{R} = \mathbb{1}_{\{\tau = \tau_c\}} (1 - R_c) (P_\tau + \Delta_\tau)^+ - \mathbb{1}_{\{\tau = \tau_b\}} (1 - R_b) (P_\tau + \Delta_\tau)^- \tag{6}$$

is the exposure at default of the bank. Equivalent to (5), the "full TVA BSDE" is written as

$$\Theta\_t = \mathbb{E}\_t \left[ \int\_t^{\overline{\tau}} f\_s(\Theta\_s) ds + \mathbb{1}\_{\tau < T} \xi \right], \ 0 \le t \le \overline{\tau}, \tag{I}$$

for the coefficient *ft*(ϑ) = *fvat*(ϑ) − *rt*ϑ.
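As a quick sanity check of the sign conventions in (6), the exposure at default can be coded directly. The helper below is purely illustrative (hypothetical function name and toy numbers, not part of the authors' framework):

```python
def exposure_at_default(P, Delta, R_b, R_c, bank_defaults, cpty_defaults):
    """Exposure (6): xi = 1{tau=tau_c}(1-R_c)(P+Delta)^+ - 1{tau=tau_b}(1-R_b)(P+Delta)^-."""
    Q = P + Delta                       # clean value inclusive of the dividend at default
    xi = 0.0
    if cpty_defaults:                   # CVA term: loss when the counterparty defaults
        xi += (1.0 - R_c) * max(Q, 0.0)
    if bank_defaults:                   # DVA term: "gain" when the bank itself defaults
        xi -= (1.0 - R_b) * max(-Q, 0.0)
    return xi

# The bank is owed Q = 10 and the counterparty defaults with recovery 40%:
# the bank loses (1 - 0.4) * 10 = 6.
loss = exposure_at_default(P=10.0, Delta=0.0, R_b=0.4, R_c=0.4,
                           bank_defaults=False, cpty_defaults=True)
```

The symmetric case ($Q < 0$ and the bank defaulting) produces a negative exposure, i.e. the DVA "benefit".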

# *3.2 Partially Reduced TVA BSDE*

Let $\hat{\xi}$ be a $\mathcal{G}$-predictable process, which exists by Corollary 3.23(2) in He et al. [12], such that $\hat{\xi}_\tau = \mathbb{E}[\xi \,|\, \mathcal{G}_{\tau-}]$ on $\{\tau < \infty\}$, and let $\bar{f}$ be the modified coefficient such that

$$
\bar{f}_t(\vartheta) + r_t \vartheta = \underbrace{\gamma_t \hat{\xi}_t}_{cdva_t} + \underbrace{\bar{\lambda}_t (P_t - \vartheta)^+}_{fva_t(\vartheta)}. \tag{7}
$$

As easily shown (cf. [4, Lemma 2.2]), the full TVA BSDE (I) can be simplified into the "partially reduced BSDE"

$$\bar{\Theta}_t = \mathbb{E}_t \left[ \int_t^{\bar{\tau}} \bar{f}_s(\bar{\Theta}_s)\, ds \right], \quad 0 \le t \le \bar{\tau}, \tag{II}$$

in the sense that if $\Theta$ solves (I), then $\bar{\Theta} = \Theta \mathbb{1}_{[0,\bar{\tau})}$ solves (II), while if $\bar{\Theta}$ solves (II), then the process $\Theta$ defined as $\bar{\Theta}$ before $\bar{\tau}$ and $\Theta_{\bar{\tau}} = \mathbb{1}_{\tau < T}\,\xi$ solves (I). Note that both BSDEs (I) and (II) are $(\mathcal{G}, \mathbb{Q})$ BSDEs posed over the random time interval $[0, \bar{\tau}]$, but with the terminal condition $\xi$ for (I) as opposed to a null terminal condition (and a modified coefficient) for (II).

# *3.3 Fully Reduced TVA BSDE*

Let

$$\widetilde{f}_t(\vartheta) = \bar{f}_t(\vartheta) - \gamma_t \vartheta = cdva_t + fva_t(\vartheta) - (r_t + \gamma_t)\vartheta.$$

Assume the following conditions, which are studied in Crépey and Song [4–6]:

#### **Condition (C)**. There exist:


Let <sup>E</sup>*<sup>t</sup>* denote the conditional expectation under <sup>P</sup> given *<sup>F</sup>t*. It is shown in Crépey and Song [4–6]) that the full TVA BSDE (I) is equivalent to the following "fully reduced BSDE":

$$\widetilde{\Theta}\_t = \widetilde{\mathbb{E}}\_t \left[ \int\_t^T \widetilde{f}\_s(\widetilde{\Theta}\_s) ds \right], \quad t \in [0, T], \tag{\text{III}}$$

equivalent in the sense that if $\Theta$ solves (I), then the "$\mathcal{F}$ optional reduction" $\widetilde{\Theta}$ of $\Theta$ (the $\mathcal{F}$ optional process that coincides with $\Theta$ before $\tau$) solves (III), while if $\widetilde{\Theta}$ solves (III), then $\Theta = \widetilde{\Theta}\, \mathbb{1}_{[0,\bar{\tau})} + \mathbb{1}_{[\bar{\tau}]} \mathbb{1}_{\tau < T}\,\xi$ solves (I).

Moreover, under mild assumptions (see e.g. Crépey and Song [6, Theorem 4.1]), one can easily check that $\bar{f}_t(\vartheta)$ in (7) (resp. $\widetilde{f}_t(\vartheta)$) satisfies the classical BSDE monotonicity assumption

$$\left(\bar{f}_t(\vartheta) - \bar{f}_t(\vartheta')\right)(\vartheta - \vartheta') \le C(\vartheta - \vartheta')^2$$

(and likewise for*<sup>f</sup>* ), for some constant *<sup>C</sup>*. Hence, by classical BSDE results nicely surveyed in Kruse and Popier [14, Sect. 2 (resp. 3)], the partially reduced TVA BSDE (II), hence the equivalent full TVA BSDE (I) (resp. the fully reduced BSDE (III)), is well-posed in the space of (*G* , Q) (resp. (*F*, P)) square integrable solutions, where well-posedness includes existence, uniqueness, comparison and BSDE standard estimates.

# *3.4 Marked Default Time Setup*

In order to be able to compute $\gamma_t \hat{\xi}_t$ in $\bar{f}$, we assume that $\tau$ is endowed with a mark $e$ in a finite set $E$, in the sense that

$$
\tau = \min_{e \in E} \tau_e, \tag{8}
$$


where each τ*<sup>e</sup>* is a stopping time with intensity γ *<sup>e</sup> <sup>t</sup>* such that <sup>Q</sup>(τ*<sup>e</sup>* = <sup>τ</sup>*<sup>e</sup>*) <sup>=</sup> <sup>1</sup>, *<sup>e</sup>* = *<sup>e</sup>* , and

$$
\mathcal{G}_\tau = \mathcal{G}_{\tau-} \vee \sigma(\varepsilon),
$$

where $\varepsilon = \operatorname{argmin}_{e \in E}\, \tau_e$ yields the "identity" of the mark. The role of the mark is to convey some additional information about the default, e.g. to encode wrong-way and gap risk features. The assumption of a finite set $E$ in (8) ensures tractability of the setup. In fact, by Lemma 5.1 in Crépey and Song [6], there exist $\mathcal{G}$-predictable processes $\widetilde{P}_t^e$ and $\widetilde{\Delta}_t^e$ such that

$$P_\tau = \widetilde{P}_\tau^{e} \ \text{ and } \ \Delta_\tau = \widetilde{\Delta}_\tau^{e} \ \text{ on the event } \{\tau = \tau_e\}.$$

Assuming further that τ*<sup>b</sup>* = min*<sup>e</sup>*∈*Eb* τ*<sup>e</sup>* and τ*<sup>c</sup>* = min*<sup>e</sup>*∈*Ec* τ*e*, where *E* = *Eb* ∪ *Ec* (not necessarily a disjoint union), one can then take on [0, τ¯]:

$$\gamma_t \hat{\xi}_t = (1 - R_c) \sum_{e \in E_c} \gamma_t^{e} \left( \widetilde{P}_t^{e} + \widetilde{\Delta}_t^{e} \right)^+ - (1 - R_b) \sum_{e \in E_b} \gamma_t^{e} \left( \widetilde{P}_t^{e} + \widetilde{\Delta}_t^{e} \right)^-,$$

where the two terms have clear respective CVA and DVA interpretations. Hence, (7) is rewritten, on $[0, \bar{\tau}]$, as

$$\begin{aligned} \bar{f}_t(\vartheta) + r_t\vartheta &= \underbrace{(1 - R_c) \sum_{e \in E_c} \gamma_t^{e} \left(\widetilde{P}_t^{e} + \widetilde{\Delta}_t^{e}\right)^+}_{\text{CVA coefficient } (cva_t)} - \underbrace{(1 - R_b) \sum_{e \in E_b} \gamma_t^{e} \left(\widetilde{P}_t^{e} + \widetilde{\Delta}_t^{e}\right)^-}_{\text{DVA coefficient } (dva_t)} \\ &\quad + \underbrace{\bar{\lambda}_t \left(P_t - \vartheta\right)^+}_{\text{FVA coefficient } (fva_t(\vartheta))}. \end{aligned} \tag{9}$$

If the functions *<sup>P</sup><sup>e</sup> <sup>t</sup>* and <sup>Δ</sup> *<sup>e</sup> <sup>t</sup>* above not only exist, but can be computed explicitly (as will be the case in the concrete models of Sects. 5.1 and 5.2), once stated in a Markov setup where

$$f\_t(\vartheta) = f(t, X\_t, \vartheta), \ t \in [0, T], \tag{10}$$

for some $(\mathcal{G}, \mathbb{Q})$ jump diffusion $X$, then the partially reduced TVA BSDE (II) can be tackled numerically. Similarly, once stated in a Markov setup where

$$
\widetilde{f}\_t(\vartheta) = \widetilde{f}(t, \widetilde{X}\_t, \vartheta), \ t \in [0, T], \tag{11}
$$

for some (*F*, <sup>P</sup>) jump diffusion *<sup>X</sup>*, then the fully reduced TVA BSDE (III) can be tackled numerically.

# **4 TVA Numerical Schemes**

# *4.1 Linear Approximation*

Our first TVA approximation is obtained by replacing $\Theta_s$ with 0 in the right-hand side of (I), i.e.

$$\Theta_0 \approx \mathbb{E}\left[\int_0^{\bar{\tau}} f_s(0)\,ds + \mathbb{1}_{\tau < T}\, \xi \right] = \mathbb{E}\left[\int_0^{\bar{\tau}} \bar{\lambda}_s P_s^+\, ds + \mathbb{1}_{\tau < T}\, \xi \right]. \tag{12}$$

We then approximate the TVA by standard Monte Carlo, with randomization of the integral to reduce the computation time (at the cost of a small increase in the variance). Hence, introducing an exponential time $\zeta$ with parameter $\mu$, i.e. a random variable with density $\phi(s) = \mathbb{1}_{s \ge 0}\, \mu e^{-\mu s}$, we have

$$\mathbb{E}\left[\int_0^{\bar{\tau}} f_s(0)\,ds\right] = \mathbb{E}\left[\int_0^{\bar{\tau}} \phi(s) \frac{e^{\mu s}}{\mu} f_s(0)\,ds\right] = \mathbb{E}\left[\mathbb{1}_{\zeta < \bar{\tau}} \frac{e^{\mu \zeta}}{\mu} f_\zeta(0)\right]. \tag{13}$$

We can use the same technique for (II) and (III), which yields:

$$\Theta_0 = \bar{\Theta}_0 \approx \mathbb{E}\left[\int_0^{\bar{\tau}} \bar{f}_s(0)\, ds\right] = \mathbb{E}\left[\mathbb{1}_{\zeta < \bar{\tau}} \frac{e^{\mu\zeta}}{\mu} \bar{f}_\zeta(0)\right], \tag{14}$$

$$\Theta_0 = \widetilde{\Theta}_0 \approx \widetilde{\mathbb{E}} \left[ \int_0^T \widetilde{f}_s(0)\, ds \right] = \widetilde{\mathbb{E}} \left[ \mathbb{1}_{\zeta < T} \frac{e^{\mu \zeta}}{\mu} \widetilde{f}_\zeta(0) \right]. \tag{15}$$
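The randomization trick (13)–(15) is straightforward to implement. The sketch below estimates $\mathbb{E}[\int_0^T f(s)\,ds]$ for a deterministic toy integrand standing in for the simulated coefficient $f_s(0)$; in the actual schemes, $f$ would be evaluated along a simulated path and $T$ replaced by $\bar{\tau}$:

```python
import numpy as np

rng = np.random.default_rng(0)

def randomized_integral(f, T, mu, n_paths):
    """Estimate E[int_0^T f(s) ds] as in (13): draw zeta ~ Exp(mu) and
    average the unbiased payoff 1{zeta < T} * e^(mu*zeta)/mu * f(zeta)."""
    zeta = rng.exponential(1.0 / mu, size=n_paths)   # numpy takes the scale 1/mu
    return np.mean(np.where(zeta < T, np.exp(mu * zeta) / mu * f(zeta), 0.0))

# Toy check with f(s) = s on [0, 1]: the integral is 1/2.
est = randomized_integral(f=lambda s: s, T=1.0, mu=1.0, n_paths=400_000)
```

One exponential draw replaces a whole time-grid integration per path, which is where the computation-time saving of the randomization comes from.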

# *4.2 Linear Expansion and Interacting Particle Implementation*

Following Fujii and Takahashi [9, 10], we can introduce a perturbation parameter ε and the following perturbed form of the fully reduced BSDE (III):

$$\widetilde{\Theta}\_t^\varepsilon = \widetilde{\mathbb{E}}\_t \left[ \int\_t^T \varepsilon \widetilde{f}\_s(\widetilde{\Theta}\_s^\varepsilon) ds \right], \quad t \in [0, T], \tag{16}$$

where ε = 1 corresponds to the original BSDE (III). Suppose that the solution of (16) can be expanded in a power series of ε:

$$
\widetilde{\Theta}\_t^\varepsilon = \widetilde{\Theta}\_t^{(0)} + \varepsilon \widetilde{\Theta}\_t^{(1)} + \varepsilon^2 \widetilde{\Theta}\_t^{(2)} + \varepsilon^3 \widetilde{\Theta}\_t^{(3)} + \cdots \tag{17}
$$

The Taylor expansion of $\widetilde{f}$ at $\widetilde{\Theta}^{(0)}$ reads

$$\begin{aligned} \widetilde{f}_t(\widetilde{\Theta}_t^{\varepsilon}) &= \widetilde{f}_t(\widetilde{\Theta}_t^{(0)}) + \left(\varepsilon \widetilde{\Theta}_t^{(1)} + \varepsilon^2 \widetilde{\Theta}_t^{(2)} + \cdots\right) \partial_{\vartheta} \widetilde{f}_t(\widetilde{\Theta}_t^{(0)}) \\ &\quad + \frac{1}{2} \left(\varepsilon \widetilde{\Theta}_t^{(1)} + \varepsilon^2 \widetilde{\Theta}_t^{(2)} + \cdots\right)^2 \partial^2_{\vartheta^2} \widetilde{f}_t(\widetilde{\Theta}_t^{(0)}) + \cdots \end{aligned}$$

Collecting the terms of the same order with respect to $\varepsilon$ in (16), we obtain $\widetilde{\Theta}^{(0)}_t = 0$, due to the null terminal condition of the fully reduced BSDE (III), and

$$\begin{aligned} \widetilde{\Theta}^{(1)}_t &= \widetilde{\mathbb{E}}_t \left[ \int_t^T \widetilde{f}_s(\widetilde{\Theta}^{(0)}_s)\, ds \right], \\ \widetilde{\Theta}^{(2)}_t &= \widetilde{\mathbb{E}}_t \left[ \int_t^T \widetilde{\Theta}^{(1)}_s\, \partial_{\vartheta} \widetilde{f}_s(\widetilde{\Theta}^{(0)}_s)\, ds \right], \\ \widetilde{\Theta}^{(3)}_t &= \widetilde{\mathbb{E}}_t \left[ \int_t^T \widetilde{\Theta}^{(2)}_s\, \partial_{\vartheta} \widetilde{f}_s(\widetilde{\Theta}^{(0)}_s)\, ds \right], \end{aligned} \tag{18}$$

where the third-order term should contain another component based on $\partial^2_{\vartheta^2} \widetilde{f}$. But, in our case, $\partial^2_{\vartheta^2} \widetilde{f}$ involves a Dirac measure via the term $(P_t - \vartheta)^+$ in $fva_t(\vartheta)$, so that we truncate the expansion at the term $\widetilde{\Theta}^{(3)}_t$ as above. If the nonlinearity in (III) is sub-dominant, one can expect to obtain a reasonable approximation of the original equation by setting $\varepsilon = 1$ at the end of the calculation, i.e.

$$
\tilde{\Theta}\_0 \approx \tilde{\Theta}\_0^{(1)} + \tilde{\Theta}\_0^{(2)} + \tilde{\Theta}\_0^{(3)}.
$$

Carrying out a Monte Carlo simulation by an Euler scheme for every time $s$ in a time grid and integrating to obtain $\widetilde{\Theta}^{(1)}_0$ would be quite heavy. Moreover, this would become completely impractical for the higher-order terms, which involve iterated (multivariate) time integrals. For these reasons, Fujii and Takahashi [10] introduced a particle interpretation to randomize and compute numerically the integrals in (18), which we call the FT scheme. Let $\eta_1$ be the interaction time of a particle drawn independently as the first jump time of a Poisson process with an arbitrary intensity $\mu > 0$ starting from time $t \ge 0$, i.e., $\eta_1$ is a random variable with density

$$\phi(t,s) = \mathbb{1}\_{s \ge t} \, \mu \, e^{-\mu(s-t)}. \tag{19}$$

From the first line in (18), we have

$$\widetilde{\Theta}_t^{(1)} = \widetilde{\mathbb{E}}_t \left[ \int_t^T \phi(t, s) \frac{e^{\mu(s-t)}}{\mu} \widetilde{f}_s(\widetilde{\Theta}_s^{(0)})\, ds \right] = \widetilde{\mathbb{E}}_t \left[ \mathbb{1}_{\eta_1 < T} \frac{e^{\mu(\eta_1 - t)}}{\mu} \widetilde{f}_{\eta_1}(\widetilde{\Theta}_{\eta_1}^{(0)}) \right]. \tag{20}$$

Similarly, a particle representation is available for the higher orders. By applying the same procedure as above, we obtain

$$
\widetilde{\Theta}\_t^{(2)} = \widetilde{\mathbb{E}}\_t \left[ \mathbbm{1}\_{\eta\_1 < T} \widetilde{\Theta}\_{\eta\_1}^{(1)} \frac{e^{\mu(\eta\_1 - t)}}{\mu} \partial\_{\vartheta} \widetilde{f}\_{\eta\_1} (\widetilde{\Theta}\_{\eta\_1}^{(0)}) \right],
$$

where <sup>Θ</sup>(1) <sup>η</sup><sup>1</sup> can be computed by (20). Therefore, by using the tower property of conditional expectations, we obtain

$$\widetilde{\Theta}\_t^{(2)} = \widetilde{\mathbb{E}}\_t \left[ \mathbb{1}\_{\eta\_2 < T} \frac{e^{\mu(\eta\_2 - \eta\_1)}}{\mu} \widetilde{f}\_{\eta\_2}(\widetilde{\Theta}\_{\eta\_2}^{(0)}) \frac{e^{\mu(\eta\_1 - t)}}{\mu} \partial\_{\vartheta} \widetilde{f}\_{\eta\_1}(\widetilde{\Theta}\_{\eta\_1}^{(0)}) \right],\tag{21}$$

where η1, η<sup>2</sup> are the two consecutive interaction times of a particle randomly drawn with intensity μ starting from *t*. Similarly, for the third order, we get

$$
\widetilde{\Theta}_t^{(3)} = \widetilde{\mathbb{E}}_t \left[ \mathbb{1}_{\eta_3 < T} \frac{e^{\mu(\eta_3 - \eta_2)}}{\mu} \widetilde{f}_{\eta_3}(\widetilde{\Theta}_{\eta_3}^{(0)}) \frac{e^{\mu(\eta_2 - \eta_1)}}{\mu} \partial_{\vartheta} \widetilde{f}_{\eta_2}(\widetilde{\Theta}_{\eta_2}^{(0)}) \frac{e^{\mu(\eta_1 - t)}}{\mu} \partial_{\vartheta} \widetilde{f}_{\eta_1}(\widetilde{\Theta}_{\eta_1}^{(0)}) \right], \tag{22}
$$

where $\eta_1, \eta_2, \eta_3$ are consecutive interaction times of a particle randomly drawn with intensity $\mu$ starting from $t$. In the case $t = 0$, (20), (21) and (22) can be simplified as

$$\begin{aligned} \widetilde{\Theta}_0^{(1)} &= \widetilde{\mathbb{E}}\left[\mathbb{1}_{\zeta_1 < T} \frac{e^{\mu \zeta_1}}{\mu} \widetilde{f}_{\zeta_1}(\widetilde{\Theta}_{\zeta_1}^{(0)})\right] \\ \widetilde{\Theta}_0^{(2)} &= \widetilde{\mathbb{E}}\left[\mathbb{1}_{\zeta_1 + \zeta_2 < T} \frac{e^{\mu \zeta_1}}{\mu} \partial_{\vartheta} \widetilde{f}_{\zeta_1}(\widetilde{\Theta}_{\zeta_1}^{(0)}) \frac{e^{\mu \zeta_2}}{\mu} \widetilde{f}_{\zeta_1 + \zeta_2}(\widetilde{\Theta}_{\zeta_1 + \zeta_2}^{(0)})\right] \\ \widetilde{\Theta}_0^{(3)} &= \widetilde{\mathbb{E}}\left[\mathbb{1}_{\zeta_1 + \zeta_2 + \zeta_3 < T} \frac{e^{\mu \zeta_1}}{\mu} \partial_{\vartheta} \widetilde{f}_{\zeta_1}(\widetilde{\Theta}_{\zeta_1}^{(0)}) \frac{e^{\mu \zeta_2}}{\mu} \partial_{\vartheta} \widetilde{f}_{\zeta_1 + \zeta_2}(\widetilde{\Theta}_{\zeta_1 + \zeta_2}^{(0)}) \frac{e^{\mu \zeta_3}}{\mu} \widetilde{f}_{\zeta_1 + \zeta_2 + \zeta_3}(\widetilde{\Theta}_{\zeta_1 + \zeta_2 + \zeta_3}^{(0)}) \right] \end{aligned} \tag{23}$$

where $\zeta_1, \zeta_2, \zeta_3$ are the elapsed times between consecutive interactions, which are independent exponential random variables with parameter $\mu$.
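For a driver that is affine in $\vartheta$, the first three orders have closed forms, which gives a simple end-to-end check of the estimators (23). The code below is a toy sketch, not the authors' implementation: it takes $\widetilde{\Theta}^{(0)} = 0$ and a deterministic driver $\widetilde{f}_s(\vartheta) = a + b\vartheta$ (no simulated credit model), for which the exact orders are $aT$, $abT^2/2$ and $ab^2T^3/6$.

```python
import numpy as np

rng = np.random.default_rng(1)

def ft_orders(f, df, T, mu, n_paths):
    """FT particle estimators (23) of the first three expansion orders at t = 0,
    with theta^(0) = 0; zeta_1, zeta_2, zeta_3 are i.i.d. Exp(mu) interaction times."""
    z1, z2, z3 = (rng.exponential(1.0 / mu, n_paths) for _ in range(3))
    w1, w2, w3 = np.exp(mu * z1) / mu, np.exp(mu * z2) / mu, np.exp(mu * z3) / mu
    o1 = np.where(z1 < T, w1 * f(z1, 0.0), 0.0).mean()
    o2 = np.where(z1 + z2 < T,
                  w1 * df(z1, 0.0) * w2 * f(z1 + z2, 0.0), 0.0).mean()
    o3 = np.where(z1 + z2 + z3 < T,
                  w1 * df(z1, 0.0) * w2 * df(z1 + z2, 0.0)
                  * w3 * f(z1 + z2 + z3, 0.0), 0.0).mean()
    return o1, o2, o3

a, b, T = 0.1, 0.5, 1.0
o1, o2, o3 = ft_orders(f=lambda s, th: a + b * th, df=lambda s, th: b,
                       T=T, mu=1.0, n_paths=400_000)
# o1 + o2 + o3 approximates the first terms of the exact value (a/b)*(exp(b*T) - 1).
```

The ordering of the weight factors matches (23); the `f` and `df` lambdas simply ignore their path argument `s` in this toy case.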

Note that the pricing model is originally defined with respect to the full stochastic basis $(\mathcal{G}, \mathbb{Q})$. Even in the case where there exists a stochastic basis $(\mathcal{F}, \mathbb{P})$ satisfying the condition (C), $(\mathcal{F}, \mathbb{P})$ simulation may be nontrivial. Lemma 8.1 in Crépey and Song [6] allows us to reformulate the $(\mathcal{F}, \mathbb{P})$ expectations in (23) as the following $(\mathcal{G}, \mathbb{Q})$ expectations, with $\bar{\Theta}^{(0)} = 0$:

$$\begin{aligned} \widetilde{\Theta}_0^{(1)} &= \bar{\Theta}_0^{(1)} = \mathbb{E}\left[\mathbb{1}_{\zeta_1 < \bar{\tau}} \frac{e^{\mu \zeta_1}}{\mu} \bar{f}_{\zeta_1}(\bar{\Theta}_{\zeta_1}^{(0)})\right] \\ \widetilde{\Theta}_0^{(2)} &= \bar{\Theta}_0^{(2)} = \mathbb{E}\left[\mathbb{1}_{\zeta_1 + \zeta_2 < \bar{\tau}} \frac{e^{\mu \zeta_1}}{\mu} \partial_{\vartheta} \bar{f}_{\zeta_1}(\bar{\Theta}_{\zeta_1}^{(0)}) \frac{e^{\mu \zeta_2}}{\mu} \bar{f}_{\zeta_1 + \zeta_2}(\bar{\Theta}_{\zeta_1 + \zeta_2}^{(0)})\right] \\ \widetilde{\Theta}_0^{(3)} &= \bar{\Theta}_0^{(3)} = \mathbb{E}\left[\mathbb{1}_{\zeta_1 + \zeta_2 + \zeta_3 < \bar{\tau}} \frac{e^{\mu \zeta_1}}{\mu} \partial_{\vartheta} \bar{f}_{\zeta_1}(\bar{\Theta}_{\zeta_1}^{(0)}) \frac{e^{\mu \zeta_2}}{\mu} \partial_{\vartheta} \bar{f}_{\zeta_1 + \zeta_2}(\bar{\Theta}_{\zeta_1 + \zeta_2}^{(0)}) \frac{e^{\mu \zeta_3}}{\mu} \bar{f}_{\zeta_1 + \zeta_2 + \zeta_3}(\bar{\Theta}_{\zeta_1 + \zeta_2 + \zeta_3}^{(0)})\right], \end{aligned} \tag{24}$$

which is nothing but the FT scheme applied to the partially reduced BSDE (II). The tractability of the FT schemes (23) and (24) relies on the nullity of the terminal condition of the related BSDEs (III) and (II), which implies that $\bar{\Theta}^{(0)} = \widetilde{\Theta}^{(0)} = 0$. By contrast, an FT scheme would not be practical for the full TVA BSDE (5), with its terminal condition $\xi \neq 0$. Also note that the first order in the FT scheme (23) (resp. (24)) is nothing but the linear approximation (15) (resp. (14)).

# *4.3 Marked Branching Diffusion Approach*

Based on an old idea of McKean [16], the solution $u(t_0, x_0)$ to a PDE

$$
\partial_t u + \mathcal{A} u + \mu (F(u) - u) = 0, \quad u(T, x) = \Psi(x), \tag{25}
$$

where $\mathcal{A}$ is the infinitesimal generator of a strong Markov process $X$ and $F(y) = \sum_{k=0}^{d} a_k y^k$ is a polynomial of degree $d$, admits a probabilistic representation in terms of a random tree $\mathcal{T}$ (branching diffusion). The tree starts from a single particle ("trunk") born from $(t_0, x_0)$. Subsequently, every particle born from a node $(t, x)$ evolves independently according to the generator $\mathcal{A}$ of $X$ until it dies at time $t' = t + \zeta$ in a state $x'$, where $\zeta$ is an independent $\mu$-exponential time (one for each particle). Moreover, in dying, a particle gives birth to an independent number $k$ of new particles starting from the node $(t', x')$, where $k$ is drawn in the finite set $\{0, 1, \ldots, d\}$ with some fixed probabilities $p_0, p_1, \ldots, p_d$. The marked branching diffusion probabilistic representation reads

$$\begin{aligned} u(t_0, x_0) &= \mathbb{E}_{t_0, x_0}\left[\prod_{\text{inner nodes } (t, x, k) \text{ of } \mathcal{T}} \frac{a_k}{p_k} \prod_{\text{states } x \text{ of particles alive at } T} \Psi(x)\right] \\ &= \mathbb{E}_{t_0, x_0}\left[\prod_{k=0}^{d} \left(\frac{a_k}{p_k}\right)^{n_k} \prod_{l=1}^{\nu} \Psi(x_l)\right], \end{aligned} \tag{26}$$

where $n_k$ is the number of branchings with $k$ descendants on $(0, T)$ and $\nu$ is the number of particles alive at $T$, with corresponding locations $x_1, \ldots, x_\nu$.
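As an illustration of the representation (26), the following sketch simulates the tree for a one-dimensional Brownian particle ($\mathcal{A} = \frac{1}{2}\partial_x^2$); the function and its arguments are illustrative, not taken from the text.

```python
import numpy as np

def branching_estimate(t0, x0, T, mu, a, p, psi, rng):
    """One draw of the product estimator (26), for X a standard Brownian motion
    (A = 1/2 d^2/dx^2). a = (a_0, ..., a_d) are the polynomial coefficients of F,
    p = (p_0, ..., p_d) the branching probabilities, psi the terminal condition."""
    weight = 1.0
    stack = [(t0, x0)]                       # particles waiting to be evolved
    while stack:
        t, x = stack.pop()
        zeta = rng.exponential(1.0 / mu)     # mu-exponential lifetime
        if t + zeta >= T:                    # particle alive at T: collect psi
            weight *= psi(x + rng.normal(0.0, np.sqrt(T - t)))
        else:                                # particle dies and branches
            x_new = x + rng.normal(0.0, np.sqrt(zeta))
            k = rng.choice(len(p), p=p)      # number of children
            weight *= a[k] / p[k]            # importance weight a_k / p_k
            stack.extend([(t + zeta, x_new)] * k)
    return weight
```

Averaging many such draws estimates $u(t_0, x_0)$. For example, with $F \equiv 1$ (i.e. $a = (1, 0)$) and $\Psi \equiv 0$, the PDE (25) reduces to $\partial_t u + \frac{1}{2} u_{xx} + \mu(1 - u) = 0$, whose solution $u(t) = 1 - e^{-\mu(T - t)}$ is space-independent, so the estimator can be checked against it.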

The marked branching diffusion method of Henry-Labordère [13] for CVA computations, dubbed the PHL scheme henceforth, is based on the idea that, by approximating $y^+$ by a well-chosen polynomial $F(y)$, the solution to the PDE

$$
\partial_t u + \mathcal{A} u + \mu (u^+ - u) = 0, \quad u(T, x) = \Psi(x), \tag{27}
$$

can be approximated by the solution to the PDE (25), hence by (26). We want to apply this approach to solve the TVA BSDEs (I), (II) or (III), for which, instead of fixing the approximating polynomial $F(y)$ once and for all in the simulations, we need a state-dependent polynomial approximation to $g_t(y) = (P_t - y)^+$ (cf. (7)) in a suitable range for $y$. Moreover, (I) and (II) are BSDEs with random terminal time $\bar{\tau}$, equivalently written in a Markov setup as Cauchy–Dirichlet PDE problems, as opposed to the pure Cauchy problem (27). Hence, some adaptation of the method is required. We show how to do it for (II), after which we directly give the algorithm in the similar case of (I) and in the more classical (pure Cauchy) case of (III). Assuming $\tau$ given in terms of a $(\mathcal{G}, \mathbb{Q})$ Markov factor process $X$ as $\tau = \inf\{t > 0 : X_t \notin D\}$ for some domain $D$, the Cauchy–Dirichlet PDE used for approximating the partially reduced BSDE (II) reads:

$$\left(\partial_t + \mathcal{A}\right)\bar{u} + \mu \left(\bar{F}_{t,x}(\bar{u}) - \bar{u}\right) = 0 \text{ on } [0, T] \times D, \quad \bar{u}(t, x) = 0 \text{ for } t = T \text{ or } x \notin D, \tag{28}$$

where $\mathcal{A}$ is the generator of $X$ and $\bar{F}_{t,x}(y) = \sum_{k=0}^{d} \bar{a}_k(t, x) y^k$ is such that

$$
\mu(\bar{F}_{t,x}(y) - y) \approx \bar{f}(t, x, y), \quad \text{i.e. } \bar{F}_{t,x}(y) \approx \frac{\bar{f}(t, x, y)}{\mu} + y. \tag{29}
$$

Specifically, in view of (9), one can set

$$\bar{F}\_{t,x}(\mathbf{y}) = \frac{1}{\mu} \left( cdva(t, \mathbf{x}) + \bar{\lambda} pol \left( P(t, \mathbf{x}) - \mathbf{y} \right) - r\mathbf{y} \right) + \mathbf{y} = \sum\_{k=0}^{d} \bar{a}\_k(t, \mathbf{x}) \mathbf{y}^k,\tag{30}$$

where $pol(r)$ is a polynomial approximation of degree $d$ of $r^+$ in a suitable range for $r$. The marked branching diffusion probabilistic representation of $\bar{u}(t_0, x_0)$, $(t_0, x_0) \in [0, T] \times D$, involves a random tree $\overline{\mathcal{T}}$ made of nodes and "particles" between consecutive nodes, as follows. The tree starts from a single particle (trunk) born from the root $(t_0, x_0)$. Subsequently, every particle born from a node $(t, x)$ evolves independently according to the generator $\mathcal{A}$ of $X$ until it dies at time $t' = t + \zeta$ in a state $x'$, where $\zeta$ is an independent $\mu$-exponential time. Moreover, in dying, if its position $x'$ at time $t'$ lies in $D$, the particle gives birth to an independent number $k$ of new particles starting from the node $(t', x')$, where $k$ is drawn in the finite set $\{0, 1, \ldots, d\}$ with some fixed probabilities $p_0, p_1, \ldots, p_d$. Figure 1 describes such a random tree in the case $d = 2$. The first particle starts from the root $(t_0, x_0)$ and dies at time $t_1$, generating two new particles. The first one dies at time $t_{11}$ and generates a new particle, which dies at time $t_{111} > T$ without descendant. The second one dies at time $t_{12}$ and generates two new particles, where the first one dies at time $t_{121}$ without descendant and the second one dies at time $t_{122}$ outside the domain $D$, hence also without descendant. The blue points represent the inner nodes, the red points the outer nodes and the green points the exit points of the tree out of the time–space domain $[0, T] \times D$.

The marked branching diffusion probabilistic representation of $\bar{u}$ is written as

$$\bar{u}(t_0, x_0) = \mathbb{E}_{t_0, x_0}\left[\mathbb{1}_{\overline{\mathcal{T}} \subset [0, T] \times D} \prod_{\text{inner nodes } (t, x, k) \text{ of } \overline{\mathcal{T}}} \frac{\bar{a}_k(t, x)}{p_k}\right], \quad (t_0, x_0) \in [0, T] \times D. \tag{31}$$

**Fig. 1** PHL random tree


Note that (31) is informal at this stage, as we have not justified whether the PDE (28) has a solution $\bar{u}$, nor in which sense. In fact, the following result could be used for proving that the function $\bar{u}$ defined by (31) is a viscosity solution to (28).

**Proposition 1** *Denoting by $\bar{u}$ the function defined by the right-hand side of* (31) *(assuming integrability of the integrand on the domain* $[0, T] \times D$*), the process $Y_t = \bar{u}(t, X_t)$, $0 \le t \le \bar{\tau}$, solves the BSDE associated with the Cauchy–Dirichlet PDE* (28)*, namely*

$$Y\_t = \mathbb{E}\_t \left[ \int\_t^{\bar{\tau}} \mu \left( \bar{F}\_{s, X\_s}(Y\_s) - Y\_s \right) ds \right], \quad t \in [0, \bar{\tau}] \tag{32}$$

*(which, in view of* (29)*, approximates the partially reduced BSDE (II), so that Y* ≈ Θ¯ *provided Y is square integrable).*

*Proof* Let $(t_1, x_1, k_1)$ be the first branching point in the tree rooted at $(t, X_t)$ and let $\overline{\mathcal{T}}^j$ denote $k_1$ independent trees of the same kind rooted at $(t_1, x_1)$. By using the independence and the strong Markov property postulated for $X$, we obtain

$$\begin{aligned} \bar{u}(t, X_t) &= \sum_{k_1=0}^{d} \mathbb{E}_{t, X_t}\left[\mathbb{1}_{t_1 < \bar{\tau}}\, p_{k_1} \frac{\bar{a}_{k_1}(t_1, x_1)}{p_{k_1}} \prod_{j=1}^{k_1} \mathbb{E}_{t_1, x_1}\left[\mathbb{1}_{\overline{\mathcal{T}}^j \subset [0, T] \times D} \prod_{\text{inner nodes } (s, x, k) \text{ of } \overline{\mathcal{T}}^j} \frac{\bar{a}_k(s, x)}{p_k}\right]\right] \\ &= \mathbb{E}_{t, X_t}\left[\mathbb{1}_{t_1 < \bar{\tau}} \sum_{k_1=0}^{d} \bar{a}_{k_1}(t_1, x_1) \prod_{j=1}^{k_1} \bar{u}(t_1, x_1)\right] = \mathbb{E}_{t, X_t}\left[\mathbb{1}_{t_1 < \bar{\tau}}\, \bar{F}_{t_1, x_1}\big(\bar{u}(t_1, x_1)\big)\right] \\ &= \mathbb{E}_{t, X_t}\left[\int_t^{\bar{\tau}} \mu\, e^{-\mu(s - t)}\, \bar{F}_{s, X_s}\big(\bar{u}(s, X_s)\big)\, ds\right], \quad 0 \le t \le \bar{\tau}, \end{aligned}$$

i.e. $Y_t = \bar{u}(t, X_t)$ solves (32). $\square$

If $\mathbb{1}_{\tau < T}\, \xi$ is given as a deterministic function $\Psi(\tau, X_\tau)$, then a similar approach (using the same tree $\mathcal{T}$) can be applied to the full BSDE (I) in terms of the Cauchy–Dirichlet PDE

$$\left(\partial_t + \mathcal{A}\right) u + \mu \left(F_{t,x}(u) - u\right) = 0 \text{ on } [0, T] \times D, \quad u(t, x) = \Psi(t, x) \text{ for } t = T \text{ or } x \notin D, \tag{33}$$

where $F_{t,x}(y) = \sum_{k=0}^{d} a_k(t, x) y^k$ is such that

$$\mu(F_{t,x}(y) - y) \approx f(t, x, y), \quad \text{i.e. } F_{t,x}(y) \approx \frac{f(t, x, y)}{\mu} + y.$$

This yields the approximation formula alternative to (31):

$$\Theta_0 \approx \mathbb{E}\left[\prod_{\text{inner nodes } (t, x, k) \text{ of } \mathcal{T}} \frac{a_k(t, x)}{p_k} \prod_{\text{exit points } (t, x) \text{ of } \mathcal{T}} \Psi(t, x)\right], \tag{34}$$

where an exit point of $\mathcal{T}$ means a point where a branch of the tree leaves the time–space domain $[0, T] \times D$ for the first time. Last, regarding the $(\mathcal{F}, \mathbb{P})$ reduced BSDE (III), assuming an $(\mathcal{F}, \mathbb{P})$ Markov factor process $\widetilde{X}$ with generator $\widetilde{\mathcal{A}}$ and domain $\widetilde{D}$, we can apply a similar approach in terms of the Cauchy PDE

$$\left(\partial_t + \widetilde{\mathcal{A}}\right) \widetilde{u} + \mu \left(\widetilde{F}_{t,x}(\widetilde{u}) - \widetilde{u}\right) = 0 \text{ on } [0, T] \times \widetilde{D}, \quad \widetilde{u}(t, x) = 0 \text{ for } t = T \text{ or } x \notin \widetilde{D}, \tag{35}$$

where $\widetilde{F}_{t,x}(y) = \sum_{k=0}^{d} \widetilde{a}_k(t, x) y^k$ is such that

$$\mu(\widetilde{F}\_{t,x}(\mathbf{y}) - \mathbf{y}) \approx \widetilde{f}(t, \mathbf{x}, \mathbf{y}), \text{i.e. } \widetilde{F}\_{t,x}(\mathbf{y}) \approx \frac{\widetilde{f}(t, \mathbf{x}, \mathbf{y})}{\mu} + \mathbf{y}.$$

We obtain

$$
\Theta_0 = \widetilde{\Theta}_0 \approx \widetilde{\mathbb{E}}\left[\mathbb{1}_{\widetilde{\mathcal{T}} \subset [0, T] \times \widetilde{D}} \prod_{\text{inner nodes } (t, x, k) \text{ of } \widetilde{\mathcal{T}}} \frac{\widetilde{a}_k(t, x)}{p_k}\right], \tag{36}
$$

where $\widetilde{\mathcal{T}}$ is the branching tree associated with the Cauchy PDE (35) (similar to $\mathcal{T}$, but for the generator $\widetilde{\mathcal{A}}$).

# **5 TVA Models for Credit Derivatives**

Our goal is to apply the above approaches to TVA computations on credit derivatives referencing the names in $N = \{1, \ldots, n\}$, for some positive integer $n$, traded between the bank and the counterparty, respectively labeled as $-1$ and $0$. In this section we briefly survey two models of the default times $\tau_i$, $i \in N^\star = \{-1, 0, 1, \ldots, n\}$, that will be used for that purpose, with $\tau_b = \tau_{-1}$ and $\tau_c = \tau_0$: the dynamic Gaussian copula (DGC) model and the dynamic Marshall–Olkin copula (DMO) model. For more details the reader is referred to [8, Chaps. 7 and 8] and [6, Sects. 6 and 7].

# *5.1 Dynamic Gaussian Copula TVA Model*

#### **5.1.1 Model of Default Times**

Let there be given a function $\varsigma(\cdot)$ with unit $L^2$ norm on $\mathbb{R}_+$ and a multivariate Brownian motion $\mathbf{B} = (B^i)_{i \in N^\star}$ with pairwise constant correlation $\rho \ge 0$, in its own completed filtration $\mathcal{B} = (\mathcal{B}_t)_{t \ge 0}$. For each $i \in N^\star$, let $h_i$ be a continuously differentiable increasing function from $\mathbb{R}_+^*$ to $\mathbb{R}$, with $\lim_{s \to 0} h_i(s) = -\infty$ and $\lim_{s \to +\infty} h_i(s) = +\infty$, and let

$$\tau_i = h_i^{-1}(\varepsilon_i), \text{ where } \varepsilon_i = \int_0^{+\infty} \varsigma(u)\, dB^i_u. \tag{37}$$

Thus the $(\tau_i)_{i \in N^\star}$ follow the standard Gaussian copula model of Li [15], with correlation parameter $\rho$ and with marginal survival function $\Phi \circ h_i$ of $\tau_i$, where $\Phi$ is the standard normal survival function. In particular, the $\tau_i$ avoid each other, i.e. there are no joint defaults. In order to make the model dynamic, as required by counterparty risk applications, the model filtration $\mathcal{G}$ is given as the Brownian filtration $\mathcal{B}$ progressively enlarged by the $\tau_i$, i.e.

$$\mathcal{G}_t = \mathcal{B}_t \vee \bigvee_{i \in N^\star} \left( \sigma(\tau_i \wedge t) \vee \sigma(\{\tau_i > t\}) \right), \quad t \ge 0, \tag{38}$$

and the reference filtration $\mathcal{F}$ is given as $\mathcal{B}$ progressively enlarged by the default times of the reference names, i.e.

$$\mathcal{F}_t = \mathcal{B}_t \vee \bigvee_{i \in N} \left( \sigma(\tau_i \wedge t) \vee \sigma(\{\tau_i > t\}) \right), \quad t \ge 0. \tag{39}$$

As shown in Sect. 6.2 of Crépey and Song [6], for the filtrations $\mathcal{G}$ and $\mathcal{F}$ as above, there exists a (unique) probability measure $\mathbb{P}$ equivalent to $\mathbb{Q}$ such that the condition (C) holds. For every $i \in N^\star$, let

$$m^i_t = \int_0^t \varsigma(u)\, dB^i_u, \quad k^i_t = \tau_i \mathbb{1}_{\{\tau_i \le t\}},$$

and let $\mathbf{m}_t = (m^i_t)_{i \in N^\star}$, $\mathbf{k}_t = (k^i_t)_{i \in N^\star}$, $\widetilde{\mathbf{k}}_t = (\mathbb{1}_{i \in N}\, k^i_t)_{i \in N^\star}$. The couple $X_t = (\mathbf{m}_t, \mathbf{k}_t)$ (resp. $\widetilde{X}_t = (\mathbf{m}_t, \widetilde{\mathbf{k}}_t)$) plays the role of a $(\mathcal{G}, \mathbb{Q})$ (resp. $(\mathcal{F}, \mathbb{P})$) Markov factor process in the dynamic Gaussian copula (DGC) model.
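As a sketch (not the authors' code), the DGC default times (37) can be simulated through a one-factor representation of the equicorrelated Gaussian vector $(\varepsilon_i)$, taking the $\Psi_i$ exponential as in Sect. 6.1 so that $h_i^{-1} = \Psi_i^{-1} \circ \Phi$; the function name and signature are illustrative.

```python
import numpy as np
from math import erfc, sqrt

def dgc_default_times(gammas, rho, n_paths, seed=0):
    """Draw tau_i = Psi_i^{-1}(Phi(eps_i)) with exponential survival functions
    Psi_i(t) = exp(-gamma_i t) and eps_i jointly standard Gaussian with pairwise
    correlation rho (one-factor construction)."""
    rng = np.random.default_rng(seed)
    gammas = np.asarray(gammas, dtype=float)
    common = rng.standard_normal((n_paths, 1))            # systemic factor
    idio = rng.standard_normal((n_paths, len(gammas)))    # idiosyncratic factors
    eps = np.sqrt(rho) * common + np.sqrt(1.0 - rho) * idio
    # Phi(eps): standard normal survival function, standard uniform in law
    u = 0.5 * np.vectorize(erfc)(eps / sqrt(2.0))
    return -np.log(u) / gammas                            # Psi_i^{-1}(u)
```

By construction each $\tau_i$ is exponential with intensity $\gamma_i$, while the defaults are positively dependent through $\rho$; in particular there are no joint defaults, in line with the text.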

#### **5.1.2 TVA Model**

A DGC setup can be used as a TVA model for credit derivatives, with marks $i = -1, 0$ and $E_b = \{-1\}$, $E_c = \{0\}$. Since there are no joint defaults in this model, it is harmless to assume that the contract promises no cash flow at $\tau$, i.e. $\Delta_\tau = 0$, so that $Q_\tau = P_\tau$. By [8, Propositions 7.3.1 p. 178 and 7.3.3 p. 181], in the case of vanilla credit derivatives on the reference names, namely CDS contracts and CDO tranches (cf. (47)), there exists a continuous, explicit function $\widetilde{P}^i$ such that

$$P_\tau = \widetilde{P}^i(\tau, \mathbf{m}_\tau, \mathbf{k}_{\tau-}), \tag{40}$$

or $\widetilde{P}^i_\tau$ in a shorthand notation, on the event $\{\tau = \tau_i\}$. Hence, (9) yields

$$
\bar{f}_t(\vartheta) + r_t \vartheta = (1 - R_c)\, \chi_t^0\, (\widetilde{P}_t^0)^+ - (1 - R_b)\, \chi_t^{-1}\, (\widetilde{P}_t^{-1})^- + \bar{\lambda}_t (P_t - \vartheta)^+, \quad t \in [0, \bar{\tau}].
$$

Assume that the processes $r$ and $\bar{\lambda}$ are given before $\tau$ as continuous functions of $(t, X_t)$, which also holds for $P$ in the case of vanilla credit derivatives on names in $N$. Then the coefficients $\bar{f}$ and, in turn, $\widetilde{f}$ are deterministically given in terms of the corresponding factor processes as

$$
\bar{f}_t(\vartheta) = \bar{f}(t, X_t, \vartheta), \quad \widetilde{f}_t(\vartheta) = \widetilde{f}(t, \widetilde{X}_t, \vartheta),
$$

so that we are in the Markovian setup where the FT and the PHL schemes are valid and, in principle, applicable.

# *5.2 Dynamic Marshall–Olkin Copula TVA Model*

The above dynamic Gaussian copula model allows dealing with TVA on CDS contracts. But a Gaussian copula dependence structure is not rich enough for ensuring a proper calibration to CDS and CDO quotes at the same time. If CDO tranches are also present in a portfolio, a possible alternative is the following dynamic Marshall–Olkin (DMO) copula model, also known as the "common shock" model.

#### **5.2.1 Model of Default Times**

We define a family $\mathcal{Y}$ of "shocks", i.e. subsets $Y \subseteq N^\star$ of obligors, usually consisting of the singletons $\{-1\}, \{0\}, \{1\}, \ldots, \{n\}$ and a few "common shocks" $I_1, I_2, \ldots, I_m$ representing simultaneous defaults. For $Y \in \mathcal{Y}$, the shock time $\eta_Y$ is defined as an independent exponential random variable with parameter $\gamma_Y$. The default time of obligor $i$ in the common-shock model is then defined as

$$\tau_i = \min_{\{Y \in \mathcal{Y};\, i \in Y\}} \eta_Y. \tag{41}$$

*Example 1* Figure 2 shows one possible default path in a common-shock model with $n = 3$ and $\mathcal{Y} = \{\{-1\}, \{0\}, \{1\}, \{2\}, \{3\}, \{2, 3\}, \{0, 1, 2\}, \{-1, 0\}\}$. The inner ovals show which shocks happened and caused the observed defaults at the successive default times.
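A minimal sketch of (41) on the shock family of Example 1 (illustrative intensities; not the authors' code):

```python
import numpy as np

def dmo_default_times(shocks, gammas, obligors, n_paths, seed=0):
    """Common-shock model: draw independent eta_Y ~ Exp(gamma_Y) for each shock
    Y and set tau_i = min over shocks Y containing i, as in (41)."""
    rng = np.random.default_rng(seed)
    scales = 1.0 / np.asarray(gammas, dtype=float)
    eta = rng.exponential(scales, size=(n_paths, len(shocks)))
    return {i: eta[:, [j for j, Y in enumerate(shocks) if i in Y]].min(axis=1)
            for i in obligors}
```

Each $\tau_i$ is exponential with intensity $\sum_{Y \ni i} \gamma_Y$, and two obligors default together exactly when a common shock containing both arrives first, so joint defaults have positive probability (unlike in the DGC model).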

The full model filtration *G* is defined as

$$\mathcal{G}_t = \bigvee_{Y \in \mathcal{Y}} \left( \sigma(\eta_Y \wedge t) \vee \sigma(\{\eta_Y > t\}) \right), \quad t \ge 0.$$

Letting $\mathcal{Y}_\circ = \{Y \in \mathcal{Y};\ -1, 0 \notin Y\}$, the reference filtration $\mathcal{F}$ is given as

$$\mathcal{F}\_t = \bigvee\_{Y \in \mathcal{Y}\_\circ} \left( \sigma \left( \eta\_Y \wedge t \right) \vee \sigma \left( \{ \eta\_Y > t \} \right) \right), \ t \ge 0.$$

**Fig. 2** One possible default path in the common-shock model with *n* = 3 and *Y* = {{−1},{0},{1},{2},{3},{2, 3},{0, 1, 2},{−1, 0}}

As shown in Sect. 7.2 of Crépey and Song [6], in the DMO model with $\mathcal{G}$ and $\mathcal{F}$ as above, the condition (C) holds for $\mathbb{P} = \mathbb{Q}$. Let $J^Y = \mathbb{1}_{[0, \eta_Y)}$. Similar to $(\mathbf{m}, \mathbf{k})$ (resp. $(\mathbf{m}, \widetilde{\mathbf{k}})$) in the DGC model, the process

$$X = (J^Y)\_{Y \in \mathcal{Y}} \text{ (resp. } \widetilde{X} = (\mathbb{1}\_{Y \in \mathcal{Y}\_\circ} J^Y)\_{Y \in \mathcal{Y}}) \tag{42}$$

plays the role of a (*G* , Q) (resp. (*F*, Q)) Markov factor in the DMO model.

#### **5.2.2 TVA Model**

A DMO setup can be used as a TVA model for credit derivatives, with

$$E_b = \mathcal{Y}_b := \{ Y \in \mathcal{Y};\ -1 \in Y \}, \quad E_c = \mathcal{Y}_c := \{ Y \in \mathcal{Y};\ 0 \in Y \}, \quad E = \mathcal{Y}_\bullet := \mathcal{Y}_b \cup \mathcal{Y}_c$$

and

$$\tau_b = \tau_{-1} = \min_{Y \in \mathcal{Y}_b} \eta_Y, \quad \tau_c = \tau_0 = \min_{Y \in \mathcal{Y}_c} \eta_Y,$$

hence

$$\tau = \min_{Y \in \mathcal{Y}_\bullet} \eta_Y, \quad \gamma = \mathbb{1}_{[0, \tau)}\, \widetilde{\gamma} \text{ with } \widetilde{\gamma} = \sum_{Y \in \mathcal{Y}_\bullet} \gamma_Y. \tag{43}$$

By [8, Proposition 8.3.1 p. 205], in the case of CDS contracts and CDO tranches, for every shock $Y \in \mathcal{Y}$ and process $U = P$ or $\Delta$, there exists a continuous, explicit function $U^Y$ such that

$$U_\tau = U^Y(\tau, X_{\tau-}), \tag{44}$$

or $U^Y_\tau$ in a shorthand notation, on the event $\{\tau = \eta_Y\}$. The coefficient $\bar{f}_t(\vartheta)$ in (9) is then given, for $t \in [0, \bar{\tau}]$, by

$$\begin{split} \bar{f}\_{t}(\boldsymbol{\vartheta}) + r\_{t}\boldsymbol{\vartheta} &= (1 - R\_{c}) \sum\_{\boldsymbol{Y} \in \mathcal{Y}\_{c}} \boldsymbol{\chi}\_{t}^{\boldsymbol{Y}} (\widetilde{\boldsymbol{P}}\_{t}^{\boldsymbol{Y}} + \widetilde{\boldsymbol{\Delta}}\_{t}^{\boldsymbol{Y}})^{+} - (1 - R\_{b}) \sum\_{\boldsymbol{Y} \in \mathcal{Y}\_{b}} \boldsymbol{\chi}\_{t}^{\boldsymbol{Y}} (\widetilde{\boldsymbol{P}}\_{t}^{\boldsymbol{Y}} + \widetilde{\boldsymbol{\Delta}}\_{t}^{\boldsymbol{Y}})^{-} \\ &+ \bar{\boldsymbol{\lambda}}\_{t} (\boldsymbol{P}\_{t} - \boldsymbol{\vartheta})^{+}. \end{split} \tag{45}$$

Assuming that the processes *r* and λ¯ are given before τ as continuous functions of (*t*, *Xt*), which also holds for *P* in case of vanilla credit derivatives on the reference names, then

$$
\bar{f}_t(\vartheta) = \bar{f}(t, X_t, \vartheta), \quad \widetilde{f}_t(\vartheta) = \bar{f}_t(\vartheta) - \widetilde{\gamma}\vartheta = \widetilde{f}(t, \widetilde{X}_t, \vartheta) \tag{46}
$$

(cf. (43)), so that we are again in a Markovian setup where the FT and the PHL schemes are valid and, in principle, applicable.

# *5.3 Strong Versus Weak Dynamic Copula Model*

However, one peculiarity of the TVA BSDEs in our credit portfolio models is that, even though full and reduced Markov structures have been identified, which is required for justifying the validity of the FT and/or PHL numerical schemes, and the corresponding generators $\mathcal{A}$ and $\widetilde{\mathcal{A}}$ can be written explicitly, these Markov structures are too heavy to be of any practical use in the numerics. Instead, fast and exact simulation and clean pricing schemes are available, based on the dynamic copula structures.

Moreover, in the case of the DGC model, we lose the Gaussian copula structure after a branching point in the PHL scheme. In fact, as visible in [8, Formula (7.7) p. 175], the DGC conditional multivariate survival probability function is stated in terms of a ratio of Gaussian survival probability functions, which is explicit but does not simplify into a single Gaussian survival probability function. It is only in the DMO model that the conditional multivariate survival probability function, which arises as a ratio of exponential survival probability functions (see [8, Formula (8.11) p. 197 and Sect. 8.2.1.1]), simplifies into a genuine exponential survival probability function. Hence, the PHL scheme is not applicable in the DGC model.

The FT scheme based on (III) is not practical either, because the Gaussian copula structure only holds under $\mathbb{Q}$ and, again, the (full or reduced) Markov structures are not practical. In the end, the only practical scheme in the DGC model is the FT scheme based on the partially reduced BSDE (II). It is only in the DMO model that the FT and the PHL schemes are both practical and can be compared numerically.

# **6 Numerics**

For the numerical implementation, we consider stylized CDS contracts and protection legs of CDO tranches corresponding to dividend processes of the respective form, for 0 ≤ *t* ≤ *T* :

$$\begin{aligned} D^i_t &= \left( (1 - R_i) \mathbb{1}_{t \ge \tau_i} - S_i (t \wedge \tau_i) \right) Nom_i \\ D_t &= \left( \left( (1 - R) \sum_{j \in N} \mathbb{1}_{t \ge \tau_j} - (n + 2) a \right)^+ \wedge (n + 2)(b - a) \right) Nom, \end{aligned} \tag{47}$$

where all the recoveries $R_i$ and $R$ (resp. nominals $Nom_i$ and $Nom$) are set to 40 % (resp. to 100). The contractual spreads $S_i$ of the CDS contracts are set such that the corresponding prices are equal to 0 at time 0. Protection legs of CDO tranches, where the attachment and detachment points $a$ and $b$ are such that $0 \le a \le b \le 100\,\%$, can also be seen as CDO tranches with upfront payment. Note that credit derivatives traded as swaps and with upfront payment have coexisted since the crisis. Unless stated otherwise, the following numerical values are used:

$$r = 0, \quad R_b = 1, \quad R_c = 40\,\%, \quad \bar{\lambda} = 100 \text{ bp} = 0.01, \quad \mu = \frac{2}{T}, \quad m = 10^4.$$
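To make the tranche dividend in (47) concrete, the small helper below evaluates the protection payoff for a given number of defaults; it is a direct reading of formula (47), with names chosen here for illustration.

```python
def tranche_dividend(num_defaults, R, a, b, n, Nom=100.0):
    """Protection payoff of the [a, b] tranche in (47): the portfolio loss
    (1 - R) per default is tranched between the attachment (n + 2) * a and the
    detachment (n + 2) * b, then scaled by the nominal Nom."""
    loss = (1.0 - R) * num_defaults
    return min(max(loss - (n + 2) * a, 0.0), (n + 2) * (b - a)) * Nom
```

For instance, with $R = 40\,\%$, $n = 10$ and the $[10\,\%, 20\,\%]$ tranche, four defaults produce a loss $0.6 \times 4 = 2.4$, of which $1.2$ falls in the tranche, i.e. a payoff of $120$.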

# *6.1 Numerical Results in the DGC Model*

First we consider DGC random times $\tau_i$ defined by (37), where the function $h_i$ is chosen so that $\tau_i$ follows an exponential distribution with parameter $\gamma_i$ (which in practice can be calibrated to a related CDS spread or a suitable proxy). More precisely, let $\Phi$ and $\Psi_i$ be the survival functions of a standard normal distribution and of an exponential distribution with intensity $\gamma_i$. We choose $h_i = \Phi^{-1} \circ \Psi_i$, so that (cf. (37))

$$\mathbb{Q}(\tau_i \ge t) = \mathbb{Q}\left(\Psi_i^{-1}\left(\Phi(\varepsilon_i)\right) \ge t\right) = \mathbb{Q}\left(\Phi(\varepsilon_i) \le \Psi_i(t)\right) = \Psi_i(t),$$

since $\Phi(\varepsilon_i)$ has a standard uniform distribution. Moreover, we use a function $\varsigma(\cdot)$ in (37) that is constant before a time horizon $\bar{T} > T$ and null after $\bar{T}$, so that $\varsigma(0) = \frac{1}{\sqrt{\bar{T}}}$ (given the constraint that $\nu^2(0) = \int_0^\infty \varsigma^2(s)\, ds = 1$) and, for $t \le \bar{T}$,

$$\nu^2(t) = \int_t^\infty \varsigma^2(s)\, ds = \frac{\bar{T} - t}{\bar{T}}, \quad m^i_t = \int_0^t \varsigma(u)\, dB^i_u = \frac{1}{\sqrt{\bar{T}}} B^i_t, \quad \varepsilon_i = \int_0^\infty \varsigma(u)\, dB^i_u = \frac{1}{\sqrt{\bar{T}}} B^i_{\bar{T}}.$$

In the case of the DGC model, the only practical TVA numerical scheme is the FT scheme (24) based on the partially reduced BSDE (II), which can be described by the following steps:


4. Simulate the vector $\mathbf{m}_{\bar{T}}$ from the last simulated vector $\mathbf{m}_t$ ($t = 0$ by default) as $\mathbf{m}_t + (\mathbf{m}_{\bar{T}} - \mathbf{m}_t)$, where $\mathbf{m}_{\bar{T}} - \mathbf{m}_t = \big(\frac{1}{\sqrt{\bar{T}}}(B^i_{\bar{T}} - B^i_t)\big)_{i \in N^\star} \sim \mathcal{N}\big(0, \frac{\bar{T} - t}{\bar{T}} I_n(1, \rho)\big)$. Deduce $(B^i_{\bar{T}})_{i \in N^\star}$, hence $\tau_i = \Psi_i^{-1} \circ \Phi\big(\frac{1}{\sqrt{\bar{T}}} B^i_{\bar{T}}\big)$, $i \in N^\star$, and in turn the vectors $\mathbf{k}_{\zeta_1}$ (if $\zeta_1 < T$), $\mathbf{k}_{\zeta_1 + \zeta_2}$ (if $\zeta_1 + \zeta_2 < T$) and $\mathbf{k}_{\zeta_1 + \zeta_2 + \zeta_3}$ (if $\zeta_1 + \zeta_2 + \zeta_3 < T$).
5. Compute $\bar{f}_{\zeta_1}$, $\bar{f}_{\zeta_1 + \zeta_2}$ and $\bar{f}_{\zeta_1 + \zeta_2 + \zeta_3}$ for the three orders of the FT scheme.
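Step 4 above boils down to a conditional Gaussian draw. A sketch follows, with illustrative names; here $I_n(1, \rho)$ is read as the correlation matrix with unit diagonal and off-diagonal entries $\rho$.

```python
import numpy as np

def simulate_m_Tbar(m_t, t, T_bar, rho, rng):
    """Draw m_{T_bar} given m_t: the increment (B_{T_bar} - B_t)/sqrt(T_bar) is
    centered Gaussian with covariance ((T_bar - t)/T_bar) * I(1, rho), where
    I(1, rho) has unit diagonal and off-diagonal entries rho (cf. step 4)."""
    m_t = np.asarray(m_t, dtype=float)
    n = len(m_t)
    corr = (1.0 - rho) * np.eye(n) + rho * np.ones((n, n))
    incr = rng.multivariate_normal(np.zeros(n), ((T_bar - t) / T_bar) * corr)
    return m_t + incr
```

At $t = 0$ the components of $\mathbf{m}_{\bar{T}}$ are standard Gaussians with pairwise correlation $\rho$, from which the $\tau_i = \Psi_i^{-1} \circ \Phi(m^i_{\bar{T}})$ of step 4 follow.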

We perform TVA computations on CDS contracts with maturity $T = 10$ years, choosing for that matter $\bar{T} = T + 1 = 11$ years, hence $\varsigma = \frac{1}{\sqrt{11}} \mathbb{1}_{[0, 11]}$, for $\rho = 0.6$ unless otherwise stated. Table 1 displays the contractual spreads of the CDS contracts used in these experiments. In Fig. 3, the left graph shows the TVA on a CDS on name 1, computed in a DGC model with $n = 1$ by the FT scheme of orders 1 to 3, for different levels of nonlinearity represented by the value of the unsecured borrowing spread $\bar{\lambda}$. The right graph shows similar results regarding a portfolio comprising one CDS contract per name $i = 1, \ldots, 10$. The time-0 clean value of the default leg of the CDS in the case $n = 1$, respectively the sum of the ten default legs in the case $n = 10$, is 4.52, respectively 40.78 (of course $P_0 = 0$ in both cases, by definition of fair contractual spreads). Hence, in relative terms, the TVA numbers visible in Fig. 3 are quite high, much greater for instance than in the cases of counterparty risk on interest rate derivatives considered in Crépey et al. [7]. This is explained by the wrong-way risk feature of the DGC model, namely, the default intensities of the surviving names and the value of the CDS protection spike at defaults in this model. When $\bar{\lambda}$ increases ($\bar{\lambda} = 0$ corresponds to a linear TVA, for which the higher-order FT terms vanish), the second (resp. third) FT term may represent in each case up to 5–10 % of the first

**Table 1** Time-0 bp CDS spreads of names −1 (the bank), 0 (the counterparty) and of the reference names 1 to *n* used when *n* = 1 (*left*) and *n* = 10 (*right*)

**Fig. 3** *Left* DGC TVA on one CDS computed by FT scheme of order 1–3, for different levels of nonlinearity (unsecured borrowing spread λ¯). *Right* similar results regarding the portfolio of CDS contracts on ten names

**Fig. 4** *Left*TVA on one CDS computed by FT scheme of order 3 as a function of the DGC correlation parameter ρ. *Right* similar results regarding a portfolio of CDS contracts on ten different names

**Fig. 5** *Left* the % relative standard errors of the different orders of the expansions do not explode with the number of names (λ¯ = 100 bp). *Middle* the % relative standard errors of the different orders of the expansions do not explode with the level of nonlinearity represented by the unsecured borrowing spread λ ( ¯ *n* = 1). *Right* since FT terms are computed by purely forward Monte Carlo schemes, their computation times are linear in the number of names (λ¯ = 100 bp)

(resp. second) FT term, from which we conclude that the first FT term can be used as a first order linear estimate of the TVA, with a nonlinear correction that can be estimated by the second FT term.

In Fig. 4, the left graph shows the TVA on one CDS computed by the FT scheme of order 3 as a function of the DGC correlation parameter $\rho$, with the other parameters set as before. The right graph shows the analogous results regarding the portfolio of ten CDS contracts. In both cases, the TVA numbers increase (roughly linearly) with $\rho$, including for high values of $\rho$, as desirable from the financial interpretation point of view, whereas it has been noted in Brigo and Chourdakis [1] (see the blue curve in Fig. 1 of the SSRN version of the paper) that, for high levels of correlation between names, other models may show some pathological behaviors.

In Fig. 5, the left graph shows that the errors, in the sense of the relative standard errors (% rel. SE), of the different orders of the FT scheme do not explode with the dimension (number of credit names that underlie the CDS contracts). The middle graph, produced with *n* = 1, shows that the errors do not explode with the level of nonlinearity represented by the unsecured borrowing spread λ¯. Consistent with the fact that the successive FT terms are computed by purely forward Monte Carlo schemes, their computation times are essentially linear in the number of names, as visible in the right graph.

To conclude this section, we compare the linear approximation (14) corresponding to the first FT term in (24) (FT1 in Table 2) with the linear approximations (12)–(13) (LA in Table 2). One can see from Table 2 that the LA and FT1 estimates are consistent (at least in the sense of their 95 % confidence intervals, which always intersect each other). But the LA standard errors are larger than the FT1 ones. In fact, using the formula for the intensity $\gamma$ of $\tau$ in FT1 can be viewed as a form of variance reduction with respect to LA, where $\tau$ is simulated. Of course, for $\bar{\lambda} \ne 0$ (case of the right tables, where $\bar{\lambda} = 3\,\%$), both linear approximations are biased as compared with the complete FT estimate (with nonlinear correction, also shown in Table 2), particularly in the high-dimensional case with 10 CDS contracts (see the bottom panels in Table 2). Figure 6 completes these results by showing the LA, FT1

**Table 2** LA, FT1 and FT estimates: 1 CDS (*top*) and 10 CDSs (*bottom*), with parameters λ¯ = 0 %, ρ = 0.8 (*left*) and λ¯ = 3 %, ρ = 0.6 (*right*)



**Fig. 6** The % relative standard errors of the different schemes do not explode with the level of nonlinearity represented by the unsecured borrowing spread λ¯. *Left* 1 CDS. *Middle* 10 CDSs. *Right* the % relative standard errors of the different schemes (LA, FT1, FT in figures) do not explode with the number of names (λ¯ = 100 bp, ρ = 0.6)

and FT standard errors computed for different levels of nonlinearity and different dimensions.

Summarizing, in the DGC model, the PHL scheme is not practical. The FT scheme based on the partially reduced TVA BSDE (II) provides an efficient way of estimating the TVA. The nonlinear correction with respect to the linear approximations (14) or (15) amounts to up to 5% in relative terms, depending on the unsecured borrowing spread λ̄.

# *6.2 Numerical Results in the DMO Model*

In the DMO model, the FT scheme (18) for the fully reduced BSDE (23) can be implemented through the following steps:


We can also consider the PHL scheme (31) based on the partially reduced BSDE (II) with

$$\mathcal{O} = \{ \mathbf{x} = (\mathbf{x}^Y)_{Y \in \mathcal{Y}} \in \{0, 1\}^{\mathcal{Y}} \text{ such that } \mathbf{x}^Y = 1 \text{ for } Y \in \mathcal{Y}_\bullet \}.$$

To simulate the random tree *T* in (31), we follow the approach sketched before (31): in order to evolve *X* according to the DMO generator *A* during a time interval ζ for a particle born from a node $x = (j_Y)_{Y \in \mathcal{Y}} \in \{0, 1\}^{\mathcal{Y}}$ at time *t*, all one needs is, for each *Y* such that $j_Y = 1$, to draw an independent exponential random variable $\eta_Y$ of parameter $\gamma_Y$ and then set $x' = \big(j_Y \mathbf{1}_{[0, \eta_Y)}(\zeta)\big)_{Y \in \mathcal{Y}}$. Rephrasing in more algorithmic terms:

1. To simulate the random tree *T* under the expectation in (31), we repeat the following step (generation of particles, or segments between consecutive nodes of the tree) until a generation of particles dies without children:

For each node $(t, x = (j_Y)_{Y \in \mathcal{Y}}, k)$ issued from the previous generation of particles (starting with the root-node $(0, X_0, k = 1)$), for each of the *k* new particles, indexed by *l*, issued from that node, simulate an independent exponential random variable $\zeta^l$ and set

$$(t^l, x^l, k^l) = \Big(t + \zeta^l,\ \big(j_Y \mathbf{1}_{[0, \eta^l_Y)}(\zeta^l)\big)_{Y \in \mathcal{Y}},\ \mathbf{1}_{x^l \in D}\, \nu^l\Big),$$

where, for each *l*, the $\eta^l_Y$ are independent exponential-$\gamma_Y$ random draws and $\nu^l$ is an independent draw in the finite set $\{0, 1, \ldots, d\}$ with some fixed probabilities $p_0, p_1, \ldots, p_d$.

2. To compute the random variable Φ under the expectation in (31), we loop over the nodes of the tree *T* thus constructed (if *T* ⊂ [0, *T*] × *D*, otherwise Φ = 0 in the first place) and we form the product in (31), where the $\bar{a}_k(t, x)$ are retrieved as in (30).

The PHL schemes (34) based on the full BSDE (I) or (36) based on the fully reduced BSDE (III) can be implemented along similar lines.
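The recursive tree generation in step 1 can be sketched as follows. This is a simplified generic skeleton under illustrative parameters: it keeps only the branching mechanism (particle lifetimes Exp(μ) and child counts ν drawn with probabilities p₀, …, p_d) and omits the model-specific component-killing of the DMO generator and the weights ā_k(t, x) entering the product Φ.

```python
import random

def simulate_tree(T, mu, probs, t=0.0, rng=random):
    """Simulate one branching random tree on [0, T].

    Each particle lives an Exp(mu) time; if it dies before T it branches
    into nu children, with nu drawn in {0, ..., d} with probabilities
    `probs`. Returns the list of node times of the tree. Illustrative
    skeleton only: the DMO component-killing and the weights
    \\bar a_k(t, x) of the product Phi are omitted.
    """
    nodes = []
    stack = [t]                       # birth times of particles to process
    while stack:
        birth = stack.pop()
        death = birth + rng.expovariate(mu)
        nodes.append(min(death, T))
        if death < T:
            # the particle branches: draw its number of children nu
            nu = rng.choices(range(len(probs)), weights=probs)[0]
            stack.extend([death] * nu)
    return nodes

random.seed(0)
tree = simulate_tree(T=2.0, mu=1.0, probs=[0.5, 0.3, 0.1, 0.09, 0.01])
```

With the branching probabilities above, the mean offspring number is below one (subcritical), so the tree is finite almost surely; together with the time horizon *T* this keeps the forward simulation cheap.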

We perform TVA computations in a DMO model with *n* = 120, for individual shock intensities taken as $\gamma_{\{i\}} = 10^{-4} \times (100 + i)$ (increasing from ∼100 bp to 220 bp as *i* increases from 1 to 120) and four nested groups of common shocks $I_1 \subset I_2 \subset I_3 \subset I_4$, respectively consisting of the riskiest 3, 9, 21 and 100% (i.e. all) names, with respective shock intensities $\gamma_{I_1}$ = 20 bp, $\gamma_{I_2}$ = 10 bp, $\gamma_{I_3}$ = 6.67 bp and $\gamma_{I_4}$ = 5 bp. The counterparty (resp. the bank) is taken as the eleventh (resp. tenth) safest name in the portfolio. In the model thus specified, we consider CDO tranches with upfront payment, i.e. credit protection bought by the bank from the counterparty at time 0, with nominal 100 for each obligor, maturity *T* = 2 years and attachment (resp. detachment) points 0, 3 and 14% (resp. 3, 14 and 100%). The respective values of $P_0$ (upfront payment) for the equity, mezzanine and senior tranche are 229.65, 5.68 and 2.99. Accordingly, the ranges of approximation chosen for $pol(y) \approx y^+$ in the respective PHL schemes are 250, 200 and 10. We use polynomial approximation of order *d* = 4 with $(p_0, p_1, p_2, p_3, p_4) = (0.5, 0.3, 0.1, 0.09, 0.01)$. We set μ = 0.1 in all PHL schemes and μ = 2/*T* = 0.2 in all FT schemes.
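The shock-intensity specification above can be transcribed directly as follows. The individual and common-shock intensities are those stated in the text; the rounding of the 3, 9 and 21% group sizes to whole names is our assumption.

```python
n = 120

# Individual shock intensities: gamma_{i} = 1e-4 * (100 + i),
# increasing from about 100 bp (i = 1) to 220 bp (i = 120).
gamma_single = {i: 1e-4 * (100 + i) for i in range(1, n + 1)}

# Four nested groups of common shocks: the riskiest 3, 9, 21 and 100 %
# of the names (riskiest = largest individual intensity, i.e. largest i).
# Rounding the group sizes to whole names is our assumption.
fractions = [0.03, 0.09, 0.21, 1.00]
groups = [frozenset(range(n - round(f * n) + 1, n + 1)) for f in fractions]

# Common-shock intensities in decimal: 20, 10, 6.67 and 5 bp.
gamma_common = dict(zip(groups, [20e-4, 10e-4, 6.67e-4, 5e-4]))
```

The nesting $I_1 \subset I_2 \subset I_3 \subset I_4$ holds by construction, since each group is a suffix of the (risk-ordered) index range.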

**Fig. 7** TVA on CDO tranches with 120 underlying names computed by the FT scheme of orders 1–3 for different levels of nonlinearity (unsecured borrowing basis λ̄). *Left* equity tranche. *Middle* mezzanine tranche. *Right* senior tranche. Originally published in Crépey and Song [6]. Published with kind permission of © Springer-Verlag Berlin Heidelberg 2016. All Rights Reserved. This figure is subject to copyright protection and is not covered by a Creative Commons License

**Table 3** The FT scheme and the three PHL schemes applied to the equity (*top*), mezzanine (*middle*) and senior (*bottom*) tranche, for the parameters λ̄ = 0%, $\lambda_{I_j}$ = 60 bp/*j* (*left*) or λ̄ = 3%, $\lambda_{I_j}$ = 20 bp/*j* (*right*)




Figure 7 shows the TVA computed by the FT scheme (23) based on the fully reduced BSDE (III), for different levels of nonlinearity (unsecured borrowing basis λ̄). We observe that, in all cases, the third order term is negligible. Hence, in further FT computations, we only compute the orders 1 (linear part) and 2 (nonlinear correction) (Fig. 8).

**Fig. 8** Analog of Fig. 5 for the CDO tranche of Fig. 7 in the DMO model (λ̄ = 0.01). Originally published in Crépey and Song [6]. Published with kind permission of © Springer-Verlag Berlin Heidelberg 2016. All Rights Reserved. This figure is subject to copyright protection and is not covered by a Creative Commons License

**Fig. 9** *Bottom* the % relative standard errors do not explode with the number of names (λ̄ = 100 bp). *Top* the % relative standard errors do not explode with the level of nonlinearity represented by the unsecured borrowing spread λ̄ (*n* = 120). *Left* FT scheme. *Middle* and *right* PHL schemes

Table 3 compares the results of the above FT scheme (23) based on the fully reduced BSDE (III) with those of the PHL schemes (36) based on (III) again, (31) based on the partially reduced BSDE (II) and (34) based on the full BSDE (I), for the three CDO tranches and two sets of parameters. The three PHL schemes are of course slightly biased, but the first two, based on the BSDEs with null terminal condition (III) or (II), exhibit much less variance than the third one, based on the full BSDE with terminal condition ξ. This is also visible in Fig. 9 (note the different scales of the *y* axes going from left to right in the picture), which also shows that, for any of these schemes, the relative standard errors do not explode with the level of nonlinearity or the number of reference names in the CDO (the results for the PHL scheme based on (II) are not shown in the figure, as they are very similar to those of the scheme based on (III)). Comparing the TVA values on the left- and right-hand sides of Table 3, we see that the intensities of the common shocks, which play a role similar to that of the correlation ρ in the DGC model, have a greater impact on the higher tranches (mezzanine and senior), whereas the equity tranche is more sensitive to the level of the unsecured borrowing spread λ̄.

# **7 Conclusion**

Under mild assumptions, three equivalent TVA BSDEs are available. The original "full" BSDE (I) is stated with respect to the full model filtration *G* and the original pricing measure Q. It does not involve the intensity γ of the counterparty first-to-default time τ. The partially reduced BSDE (II) is also stated with respect to (*G*, Q), but it involves both τ and γ. The fully reduced BSDE (III) is stated with respect to a smaller "reference filtration" *F* and it only involves γ. Hence, in principle, the full BSDE (I) should be preferred for models with a "simple" τ, whereas the fully reduced BSDE (III) should be preferred for models with a "simple" γ. But, in non-immersive setups, the fully reduced BSDE (III) is stated with respect to a modified probability measure P. Even though switching from (*G*, Q) to (*F*, P) is transparent in terms of the generator of the related Markov factor processes, this can be an issue in situations where the Markov structure is important in the theory, to guarantee the validity of the numerical schemes, but is not really practical from an implementation point of view. This is for instance the case with the credit portfolio models that we use for illustrative purposes in our numerics, where the Markov structure that emerges from the dynamic copula model is too heavy and only the copula features can be exploited numerically; these are copula features under the original stochastic basis (*G*, Q), which do not necessarily hold under a reduced basis (*F*, P) (especially when P ≠ Q). As for the partially reduced BSDE (II), its interest as compared with the full BSDE (I) is its null terminal condition, which is key for the FT scheme, as recalled below. But of course (II) can only be used when one has an explicit formula for γ.

For nonlinear and very high-dimensional problems such as counterparty risk on credit derivatives, the only feasible numerical schemes are purely forward simulation schemes, such as the linear Monte Carlo expansion of Fujii and Takahashi [9, 10] or the branching particles scheme of Henry-Labordère [13], respectively dubbed "FT scheme" and "PHL scheme" in this paper. In our setup, the PHL scheme involves a nontrivial and rather sensitive fine-tuning for finding a polynomial in ϑ that approximates the terms $(P_t - \vartheta)^{\pm}$ in $f_{vat}(\vartheta)$ in a suitable range for ϑ. This fine-tuning requires preliminary knowledge of the solution, obtained by running another approximation (a linear approximation or the FT scheme) in the first place. Another limitation of the PHL scheme in our case is that it is more demanding than the FT scheme in terms of the structural model properties that it requires. Namely, in our credit portfolio problems, both a Markov structure and a dynamic copula are required for the PHL scheme. But, whereas a "weak" dynamic copula structure, in the sense of simulation and forward pricing by copula means, is sufficient for the FT scheme, a dynamic copula in the stronger sense that the copula structure is preserved in the future is required for the PHL scheme. This strong dynamic copula property is satisfied by our common-shock model but not by the Gaussian copula model. In conclusion, the FT schemes applied to the partially or fully reduced BSDEs (II) or (III) (a null terminal condition is required, so that the full BSDE (I) is not eligible for this scheme) appear as the method of choice for these problems.

An important message of the numerics is that, even for realistically high levels of nonlinearity, i.e. an unsecured borrowing spread λ̄ = 3%, the third order FT correction was always found negligible and the second order FT correction amounted to less than 5–10% of the first order, linear FT term. In conclusion, a first order FT term can be used for obtaining "the best linear approximation" to our problem, whereas a nonlinear correction, if desired, can be computed by a second order FT term.

**Acknowledgements** This research benefited from the support of the "Chair Markets in Transition" under the aegis of Louis Bachelier laboratory, a joint initiative of École polytechnique, Université d'Évry Val d'Essonne and Fédération Bancaire Française.

The KPMG Center of Excellence in Risk Management is acknowledged for organizing the conference "Challenges in Derivatives Markets - Fixed Income Modeling, Valuation Adjustments, Risk Management, and Regulation".

**Open Access** This chapter is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, a link is provided to the Creative Commons license and any changes made are indicated.

The images or other third party material in this chapter are included in the work's Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work's Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.

# **References**


# **Tight Semi-model-free Bounds on (Bilateral) CVA**

**Jördis Helmers, Jan-J. Rückmann and Ralf Werner**

**Abstract** In the last decade, counterparty default risk has attracted increased interest from both academics and practitioners. This interest was especially motivated by the market turbulence and financial crises of the past decade, which highlighted the importance of counterparty default risk for uncollateralized derivatives. After a succinct introduction to the topic, it is demonstrated that standard models can be combined to derive semi-model-free tight lower and upper bounds on bilateral CVA (BCVA). It is shown in detail how these bounds can be easily and efficiently calculated by solving two corresponding linear optimization problems.

**Keywords** Counterparty credit risk · CVA · Tight bounds · Mass transportation problem

# **1 Introduction**

Events such as Lehman's default have drawn attention to counterparty default risk. After this default at the latest, it became obvious to all market participants that the credit qualities of both counterparties—usually a client and an investment bank—need to be considered in the pricing of uncollateralized OTC derivatives.

J. Helmers
Finbridge GmbH & Co. KG, Louisenstr. 100, 61348 Bad Homburg, Germany e-mail: joerdis.helmers@finbridge.de

J.-J. Rückmann
Department of Informatics, University of Bergen, P.O. Box 7803, 5020 Bergen, Norway e-mail: Jan-Joachim.Ruckmann@ii.uib.no

R. Werner (B)
Institut für Mathematik, Universität Augsburg, Universitätsstr. 14, 86159 Augsburg, Germany e-mail: ralf.werner@math.uni-augsburg.de

Over the past years, several authors have investigated the pricing of derivatives under these default risks, based on a variety of models. Most of these results are covered by a number of excellent books, for example Pykhtin [16], Gregory [12], or Brigo et al. [7], to name just a few. For a thorough discussion of the pros and cons of unilateral versus bilateral counterparty risk we refer to the two articles by Gregory [11, 13].

In the following exposition, we are concerned with the quantification of the smallest and largest BCVA which can be obtained by any given model with predetermined marginal laws. This takes considerations of Turnbull [21], who first derived weak bounds on CVA for certain types of products, much further. Our approach extends first ideas of Hull and White [15], where the hazard rate determining defaults is coupled to the exposure or other risk factors in either a deterministic or a stochastic way. Still, Hull and White rely on an explicit choice of the default model and on an explicit coupling. More closely related is the work by Rosen and Saunders et al. [8, 17], on which we prefer to comment later in Remark 8. As the most closely related work we note the paper by Cherubini [9], which provided the basis for this semi-model-free approach. There, only one particular two-dimensional copula was used to couple each individual forward swap par rate with the default time. Obviously, a more general approach couples each forward swap par rate with each other and with the default time, which is similar in spirit to Hull and White [15]. From there, the final step to our approach is to observe that the most general approach directly links the whole stochastic evolution of the exposure with both random default times. We will illustrate in the following that these couplings can be readily derived by linear programming. For this purpose the BCVA will be decomposed into three main components: the first component is represented by the loss process, the second component consists of the default indicators of the two counterparties, and the third component is comprised of the exposure-at-default of the OTC derivative, i.e. the risk-free present value of the outstanding amount<sup>1</sup> at the time of default. This approach extends earlier considerations of Haase and Werner [14], where comparable results were obtained from the point of view of generalized stopping problems.

In a very recent working paper, Scherer and Schulz [18] analyzed the above idea in more detail. It was shown that the computational complexity of the problem is the same regardless of whether only the marginal distributions of the defaults or their joint distribution is known.

After submission of this paper we became aware of related results by Glasserman and Yang, see [10]. Although the main idea of their exposition is similar in spirit, Glasserman and Yang focus on unilateral instead of bilateral CVA. Besides an analysis of the convergence of finite samples to the continuous setup, their exposition mainly focuses on penalizing deviations from some *base distribution*. In contrast, our focus is on bilateral CVA, with special attention to the numerical solution and to the case where payoffs also depend on the credit quality.

<sup>1</sup>In accordance with the *full two-way payment* rule under ISDA master contracts, see e.g. Bielecki and Rutkowski [2] (Sect. 14.4.4), we assume that the close-out value is determined by the then prevailing risk-free present value.

In summary, this exposition makes the following main contributions:


The rest of the paper is organized as follows. In Sect. 2 a succinct introduction to bilateral counterparty risk is given, before the decomposition of the BCVA into its building blocks is carried out in Sect. 3. In Sect. 4 the two main approaches for the calculation of counterparty valuation adjustments are briefly reviewed. Finally, the tight bounds on CVA are derived in Sect. 5, before the paper concludes.

# **2 Counterparty Default Risk**

As usual, to model financial transactions with default risk, let $(\Omega, \mathcal{G}, \mathcal{G}_t, \mathbf{Q})$ be a probability space where $\mathcal{G}_t$ models the flow of information and **Q** denotes the risk-neutral measure for a given risk-free numéraire process $N_t > 0$; see e.g. Bielecki and Rutkowski [2] for more details. Further, let the space be endowed with a right-continuous and complete sub-filtration $\mathcal{F}_t$ modeling the flow of information except default, such that $\mathcal{F}_t \subseteq \mathcal{G}_t := \mathcal{F}_t \vee \mathcal{H}_t$, with $\mathcal{H}_t$ being the right-continuous filtration generated by the default events.

Subsequently, we consider a transaction with maturity *T* between a client *A* and a counterparty *B*, where both are subject to default. The respective random default times are denoted by $\tau_A$ and $\tau_B$. In order to take into account counterparty default risk we distinguish three cases:


For simplicity of presentation, we assume in the following that $\mathbf{Q}[\tau_A = T] = \mathbf{Q}[\tau_B = T] = \mathbf{Q}[\tau_A = \tau_B] = 0$. Under this assumption these sets<sup>2</sup> yield a decomposition of one, i.e. it holds that

$$\mathbf{1}_{D_A} + \mathbf{1}_{D_B} + \mathbf{1}_{D_{\varnothing}} = 1 \quad \mathbf{Q}\text{-almost surely}.$$

In the following, let us consider a transaction consisting of cash flows $C(B, A, T_i)$ paid by the counterparty *B* at times $T_i$, $i = 1, \ldots, m_B$, and cash flows $C(A, B, T_j)$ paid by the client *A* at times $T_j$, $j = 1, \ldots, m_A$. Taking into account the default risk of both counterparties, the quantification of the bilateral CVA is summarized in the following well-known theorem, which in essence goes back to Sorensen and Bollier [19].

**Theorem 1** *Conditional on the event* $\{t < \min(\tau_A, \tau_B)\}$*, i.e. no default has occurred until time t, the value* $V_A^D(t, T)$ *of the transaction under consideration of bilateral counterparty risk at time t is given by*

$$V_A^D(t, T) = V_A(t, T) - \text{CVA}_A(t, T) = -\left(V_B(t, T) - \text{CVA}_B(t, T)\right) = -V_B^D(t, T)$$

*where the risk-free present value of the transaction is given as*

$$\begin{aligned} V_A(t, T) &= \mathbb{E}\left[\sum_{i=1}^{m_B} \frac{N_t}{N_{T_i}} \cdot C(B, A, T_i) \,\middle|\, \mathcal{F}_t \right] - \mathbb{E}\left[\sum_{j=1}^{m_A} \frac{N_t}{N_{T_j}} \cdot C(A, B, T_j) \,\middle|\, \mathcal{F}_t \right] \\ &= -V_B(t, T) \end{aligned}$$

*and where the* bilateral counterparty value adjustment *CVAA*(*t*, *T*) *is defined as*

$$\begin{split} \text{CVA}_A(t, T) &:= \mathbb{E}\left[ \mathbf{1}_{D_B} \cdot \frac{N_t}{N_{\tau_B}} \cdot L_{\tau_B}^B \cdot \max(0, V_A(\tau_B, T)) \,\middle|\, \mathcal{G}_t \right] \\ &\quad - \mathbb{E}\left[ \mathbf{1}_{D_A} \cdot \frac{N_t}{N_{\tau_A}} \cdot L_{\tau_A}^A \cdot \max(0, V_B(\tau_A, T)) \,\middle|\, \mathcal{G}_t \right] \\ &= -\text{CVA}_B(t, T). \end{split} \tag{1}$$

Here $L_t^i$ denotes the random loss (between 0 and 1) of counterparty *i* at time *t*.

*Proof* A proof of Theorem 1 can be found in Bielecki and Rutkowski [2], Formula (14.25) or Brigo and Capponi [4], Proposition 2.1 and Appendix A, respectively.

Based on Theorem 1, the general approach for the calculation of the counterparty risk adjusted value $V_A^D(t, T)$ is to first determine the risk-free value $V_A(t, T)$ of the transaction. This can be done by any common valuation method for this kind of transaction. In a second step, the counterparty value adjustment $\text{CVA}_A(t, T)$ needs to be determined. So far, two main approaches have emerged in the academic literature; they will be briefly reviewed in Sect. 4.

<sup>2</sup>We note that Brigo et al. (in [4, 6]) use different sets to order the default times, which are in essence reducible to the above three events.

# **3 The Main Building Blocks of CVA**

Subsequently, let us assume that the default times $\tau_i$ with $i \in \{A, B\}$ can only take a finite number of values $\{\bar{t}_1, \ldots, \bar{t}_K\}$ in the interval $]0, T[$. For continuous time models this assumption can be justified by the *default bucketing* approach, which can, for example, be found in Brigo and Chourdakis [5], if *K* is chosen sufficiently large. To be able to separate the default dynamics from the market value dynamics, let us introduce the auxiliary time $s \in [t, T]$ and the discounted market value

$$
\tilde{V}_i^+(t, s, T) := \frac{N_t}{N_s} \cdot \max(0, V_i(s, T)).
$$
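The default-bucketing step mentioned above can be sketched as follows (an illustrative sketch using the standard library; the grid dates and the bucketing convention $\Delta_k = \,]\bar{t}_{k-1}, \bar{t}_k]$ are assumptions for the example):

```python
import bisect

def bucket_default_time(tau, t_bar):
    """Map a continuous default time tau to the 0-based index k into the
    sorted grid t_bar such that tau falls in the bucket ]t_bar[k-1], t_bar[k]]
    (default bucketing). Assumes 0 < tau <= t_bar[-1]."""
    return bisect.bisect_left(t_bar, tau)

t_bar = [0.5, 1.0, 1.5, 2.0]  # K = 4 hypothetical bucket dates on ]0, T = 2]
k = bucket_default_time(0.7, t_bar)  # tau = 0.7 lies in ]0.5, 1.0]
```

Note that a default time falling exactly on a grid date is assigned to the bucket whose right endpoint it is, matching the half-open convention.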

Then we can rewrite Eq. (1) as:

$$\begin{split} \text{CVA}_A(t, T) &= \mathbb{E}\left[\sum_{k=1}^{K} L_{\bar{t}_k}^B \cdot \mathbf{1}_{D_B} \cdot \mathbf{1}_{\bar{t}_k}(\tau_B) \cdot \tilde{V}_A^+(t, \bar{t}_k, T) \,\middle|\, \mathcal{G}_t \right] \\ &\quad - \mathbb{E}\left[\sum_{k=1}^{K} L_{\bar{t}_k}^A \cdot \mathbf{1}_{D_A} \cdot \mathbf{1}_{\bar{t}_k}(\tau_A) \cdot \tilde{V}_B^+(t, \bar{t}_k, T) \,\middle|\, \mathcal{G}_t \right]. \end{split} \tag{2}$$

Here, $\mathbf{1}_M$ is the indicator function of the set *M*; if $M = \{m\}$ we simply write $\mathbf{1}_m$ instead. Now, collecting all terms relating to the default in the default indicator process δ,

$$
\delta_k^i := \mathbf{1}_{D_i} \cdot \mathbf{1}_{\bar{t}_k}(\tau_i),
$$

we can rewrite the BCVA in a more compact manner as

$$\begin{split} \text{CVA}_A(t, T) &= \mathbb{E}\left[\sum_{k=1}^{K} L_{\bar{t}_k}^B \cdot \delta_k^B \cdot \tilde{V}_A^+(t, \bar{t}_k, T) \,\middle|\, \mathcal{G}_t \right] \\ &\quad - \mathbb{E}\left[\sum_{k=1}^{K} L_{\bar{t}_k}^A \cdot \delta_k^A \cdot \tilde{V}_B^+(t, \bar{t}_k, T) \,\middle|\, \mathcal{G}_t \right]. \end{split} \tag{3}$$

From Eq. (3) we immediately see that the BCVA at time *t* is composed of six discrete-time<sup>3</sup> processes:


In this way, we are able to separate the default dynamics δ from the loss process *L* and the exposure process *V* . From this decomposition, it becomes obvious that the BCVA is completely determined by the joint distribution of these six processes.

<sup>3</sup>In the following, we replace the time index $\bar{t}_k$ with *k* for notational convenience.

*Remark 1* We note that in general it is even sufficient to model four processes (loss dynamics and market value dynamics) plus a two-dimensional random variable $(\tau_A, \tau_B)$. However, in the case of finitely many default times, it is more convenient to work with the default indicator process instead.

*Remark 2* For simplicity of the subsequent exposition, we assume that the loss process is actually constant and equal to 1: $L_t^i = l^i = 1$. The theory in the remainder of this exposition is not affected by this simplifying assumption, with one notable exception: the resulting two-dimensional transportation problem becomes a multi-dimensional transportation problem, which renders its numerical solution more complex, but still feasible.

*Remark 3* As we have noted, the default indicator process can only take a finite number of values in the bucketing approach. More exactly, the joint (i.e. two-dimensional) default indicator process $\delta = (\delta_k)_{k=1,\ldots,K} \in \mathbb{R}^{2 \times K}$, defined by

$$
\delta_k := \begin{pmatrix} \delta_k^A \\ \delta_k^B \end{pmatrix}, \quad k = 1, \ldots, K,
$$

takes only values in the finite set

$$\mathcal{Y} := \left\{ \gamma \in \mathbb{R}^{2 \times K} \,\middle|\, \gamma_{i,k} \in \{0, 1\}, \; \sum_{i,k} \gamma_{i,k} \le 1 \right\}$$

which has exactly 2*K* + 1 elements. Therefore, the discrete time default indicator process is also a process with a finite state space.
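That the state space has exactly 2*K* + 1 elements (one state per pair (*i*, *k*) in which a single default has occurred, plus the no-default state) can be checked by brute-force enumeration:

```python
from itertools import product

def default_indicator_states(K):
    """Enumerate all 2xK binary arrays with at most one entry equal to 1,
    i.e. the state space of the joint default indicator process delta."""
    return [flat for flat in product([0, 1], repeat=2 * K)
            if sum(flat) <= 1]

K = 5
states = default_indicator_states(K)
print(len(states))  # 2*K + 1 = 11
```

The enumeration is exponential in *K* and serves only as a sanity check of the counting argument; the count itself is immediate, since a valid state either has a single 1 in one of the 2*K* positions or is identically zero.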

Let us further introduce the joint exposure process in analogy to the above,

$$X_k := \begin{pmatrix} \tilde{V}_A^+(t, \bar{t}_k, T) \\ \tilde{V}_B^+(t, \bar{t}_k, T) \end{pmatrix}, \quad k = 1, \ldots, K.$$

Then it holds

$$\text{CVA}_A(t, T) = \sum_{k=1}^{K} \left( \mathbb{E}\left[ \delta_k^B \cdot X_k^A \,\middle|\, \mathcal{G}_t \right] - \mathbb{E}\left[ \delta_k^A \cdot X_k^B \,\middle|\, \mathcal{G}_t \right] \right). \tag{4}$$

To avoid technical considerations, for brevity of presentation we prefer to work with discrete processes (i.e. with a discrete state space) in discrete time. Thus, it may be necessary to discretize the state space of the remaining discounted exposure process. In general, there exist (at least) two different approaches by which a suitable discrete state space version of the process *X* can be obtained:

• In the first approach—completely similar to the default bucketing approach—the state space $\mathbb{R}^{2 \times K}$ for the joint exposure process *X* is divided into *N* disjoint components. Then *X* is replaced by some representative value (usually an average value) on each of the components, and the probabilities of the discretized process are set in accordance with the original probabilities of each component (cf. the default bucketing approach).

• From a computational and practical point of view, a much more convenient approach relies on Monte Carlo simulation: *N* different scenarios (i.e. realizations) of the process *X* are used instead of the original process. Each realization is assumed to have probability 1/*N*.

For both approaches it is known that they converge at least<sup>4</sup> in distribution to the original process, which is sufficient for our purposes. For more details on the convergence, we refer to the recent working paper by Glasserman and Yang [10].
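Under the Monte Carlo approach, where each of the *N* scenarios carries probability 1/*N*, the conditional expectations in Eq. (4) become plain sample averages. The sketch below illustrates this with hypothetical toy scenarios and unit losses as in Remark 2:

```python
def bcva_from_scenarios(scenarios):
    """Estimate CVA_A(t, T) from Eq. (4) as a sample average.

    Each scenario is a tuple (delta_A, delta_B, X_A, X_B) of four
    length-K lists: the default indicators and the discounted positive
    exposures per time bucket. Unit losses (L = 1) are assumed."""
    total = 0.0
    for delta_A, delta_B, X_A, X_B in scenarios:
        total += sum(dB * xA - dA * xB
                     for dA, dB, xA, xB in zip(delta_A, delta_B, X_A, X_B))
    return total / len(scenarios)

# Two hypothetical scenarios with K = 2 buckets:
scenarios = [
    ([0, 0], [1, 0], [5.0, 4.0], [3.0, 2.0]),  # B defaults in bucket 1
    ([0, 1], [0, 0], [5.0, 4.0], [3.0, 2.0]),  # A defaults in bucket 2
]
print(bcva_from_scenarios(scenarios))  # (5.0 - 2.0) / 2 = 1.5
```

In each scenario at most one default indicator is nonzero, consistent with the state space 𝒴 of Remark 3.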

# **4 Models for Counterparty Risk**

In the last decade, two main approaches have emerged in the literature for modeling the individual, resp. joint, distribution of the processes δ and *X*:


# *4.1 Independence of CVA Components*

Let us assume that the exposure process *X* is independent of the default process δ. Then the expectation inside the summation can be split into two parts:

$$\sum_{k=1}^{K} \mathbb{E}\left[ \delta_k^B \cdot X_k^A \,\middle|\, \mathcal{G}_t \right] = \sum_{k=1}^{K} \mathbb{E}\left[ \delta_k^B \,\middle|\, \mathcal{G}_t \right] \cdot \mathbb{E}\left[ X_k^A \,\middle|\, \mathcal{G}_t \right]. \tag{5}$$

<sup>4</sup>The Monte Carlo approach converges in distribution due to the theorem of Glivenko–Cantelli. For state space discretization, if for example conditional expectations are used on each bucket, then convergence is in fact almost sure and in $L^1$ due to Lévy's 0–1 law.

It is well known that the expected value

$$\mathbb{E}\left[ X_k^A \,\middle|\, \mathcal{G}_t \right] = \mathbb{E}\left[ \tilde{V}_A^+(t, \bar{t}_k, T) \,\middle|\, \mathcal{G}_t \right] = \mathbb{E}\left[ \frac{N_t}{N_{\bar{t}_k}} \cdot \max(V_A(\bar{t}_k, T), 0) \,\middle|\, \mathcal{F}_t \right] \tag{6}$$

matches exactly the price of a call option on the basis transaction at time *t* with strike 0 and exercise time $\bar{t}_k$. The CVA equation can hence be rewritten as

$$\text{CVA}_A(t, T) = \sum_{k=1}^{K} \left( \mathbb{E}\left[ \delta_k^B \,\middle|\, \mathcal{G}_t \right] \cdot \mathbb{E}\left[ X_k^A \,\middle|\, \mathcal{G}_t \right] - \mathbb{E}\left[ \delta_k^A \,\middle|\, \mathcal{G}_t \right] \cdot \mathbb{E}\left[ X_k^B \,\middle|\, \mathcal{G}_t \right] \right), \tag{7}$$

and thus the BCVA can be calculated without any further problems, as the corresponding default probabilities<sup>5</sup> $\mathbb{E}\left[ \delta_k^B \,\middle|\, \mathcal{G}_t \right] = \mathbf{Q}\left[ \tau_B \in \Delta_k, \tau_B \le \tau_A \,\middle|\, \mathcal{G}_t \right]$ can be easily computed from any given credit risk model: in order to calculate the probability $\mathbf{Q}\left[ \tau_B \in \Delta_k, \tau_B \le \tau_A \,\middle|\, \mathcal{G}_t \right]$, the default times $\tau_A$ and $\tau_B$ together with their dependence structure have to be modeled. Among the most popular models for default times in general are intensity models, as described, for example, in Bielecki and Rutkowski [2], Part III.
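Under independence, Eq. (7) therefore reduces to a sum of products of default probabilities and strike-0 option prices. A minimal sketch with hypothetical toy inputs:

```python
def bcva_independent(p_B, p_A, call_A, call_B):
    """CVA_A(t, T) from Eq. (7) under independence of delta and X.

    p_B[k] = Q[tau_B in Delta_k, tau_B <= tau_A | G_t] (and symmetrically
    p_A[k]); call_i[k] = E[X_k^i | G_t], the strike-0 option prices of
    Eq. (6). Unit losses (L = 1) are assumed, as in Remark 2."""
    return sum(pb * ca - pa * cb
               for pb, pa, ca, cb in zip(p_B, p_A, call_A, call_B))

# Hypothetical inputs for K = 2 buckets:
p_B = [0.01, 0.02]    # counterparty-first default probabilities
p_A = [0.005, 0.01]   # client-first default probabilities
call_A = [5.0, 4.0]   # E[X_k^A | G_t]
call_B = [3.0, 2.0]   # E[X_k^B | G_t]
print(bcva_independent(p_B, p_A, call_A, call_B))  # approx. 0.095
```

The two inputs can come from entirely different models: any credit risk model for the default probabilities, any calibrated pricing model for the option prices.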

*Remark 4* It has to be noted that a model with deterministic default intensities plus a suitable copula is sufficient for the arbitrary specification of the joint distribution of the default times. Stochastic intensities do not add any value in this context. This is true as long as the default risk-free discounted present value is independent of the credit quality of each counterparty. This means that the payoff itself is not allowed to be linked explicitly to the credit quality of any counterparty.

*Remark 5* Let us point out that the intensity model is just one specific example of how default times could be modeled. The big advantage of our approach is that any arbitrary credit risk model can be used instead, as only the distribution of the default indicator δ matters in the end. In case only marginal default models are available, we can still take into account the remaining unknown dependence between the default times, however at the price of a higher dimensional transportation problem.

# *4.2 Modeling Options on the Basis Transaction*

Since it could be observed in Eq. (6) that options on the basis transaction need to be priced, a suitable model for this option pricing task needs to be available. Depending on the type of derivative, any model which can be reasonably well calibrated to the market data is sufficient. For instance, for interest rate derivatives, any model ranging from a simple Vasicek or CIR model to sophisticated Libor market models or two-factor Hull–White models could be applied. In case of a credit default swap,

<sup>5</sup>With $\Delta_k := \left]\bar t_{k-1}, \bar t_k\right]$ if the default bucketing approach has been used, otherwise $\Delta_k := \{\bar t_k\}$.

any model that allows one to price CDS options, i.e., any model with stochastic credit spreads, would be feasible. However, for CVA calculations, a trade-off between the accuracy of the model and the efficiency of the calculations usually needs to be made. For this reason, simpler models are typically applied for CVA calculations than for other pricing applications. Note that since the financial market usually provides sufficiently many prices of liquid derivatives, any reasonable model can be calibrated to these market prices, and therefore we can assume in the following that the market-implied distribution of the discounted exposure process is fully known and available.

# *4.3 Hybrid Models—An Example*

Another way to calculate the CVA is to use a so-called *hybrid approach*, which models all the involved underlying risk factors. Instances of such models can be found, for example, in Brigo and Capponi [4] for the case of a credit default swap, or Brigo et al. [6] for interest rate derivatives. In Brigo et al. [6], an integrated framework is introduced, where a two-factor Gaussian interest-rate model is set up for a variety of interest rate derivatives<sup>6</sup> in order to deal with the option inherent in the CVA. Further, to model the possible default of the client and its counterparty, their stochastic default intensities are given as CIR processes with exponentially distributed positive jumps. The Brownian motions driving those risk factors are assumed to be correlated. Additionally, the defaults of the client and the counterparty are linked by a Gaussian copula.

In summary, the amount of wrong-way risk which can be modeled within such a framework strongly depends on the model choice. If solely correlations between default intensities (i.e. credit spreads) and interest rates are taken into account, only a rather weak relation will emerge between default and the exposure of interest rate derivatives, cf. Brigo et al. [6]. Figure 5 in Scherer and Schulz [18] provides an overview of potential CVA values for different models, illustrating that models can differ quite significantly.

# **5 Tight Bounds on CVA**

From the previous section it becomes obvious that hybrid models yield different CVAs depending on the (model- and parameter-implied) degree of dependence between default and exposure. However, it remains unclear how large the impact of this dependence can be. In other words: *Is it possible to quantify how small or large the CVA can get for any model, given that the marginal distributions for exposure and default are already given?* In the following, we address this question based on our initially given decomposition of the CVA into building blocks.

<sup>6</sup>Although this modeling approach is a rather general one, it has to be noted that it links the dependence of swaption volatilities on tenors to the shape of the initial yield curve. The limits of such an approach became apparent when the yield curve steepened in conjunction with a movement of the volatility surface in the aftermath of the 2008 financial crisis; these effects could not be reproduced by such a model.

As mentioned in Sect. 4.2, we can reasonably assume that the distribution of the exposure process *X* is already completely determined by the available market information. In a similar manner, we have argued that also the distribution of the default indicator process δ can be assumed to be given by the market. Nevertheless, let us point out that the following ideas and concepts could indeed be generalized to the case that only the marginal distributions of the default times are known. Further, we can even consider the case that the dependence structure between different market risk factors is not known but remains uncertain. However, all these generalizations come at the price that the resulting two-dimensional transportation problem will become multi-dimensional.

For the above reasons, we argue that the following approach is indeed *semi-model-free* in the sense that no model needs to be specified which links the default indicator process with the discounted exposure process.

# *5.1 Tight Bounds on CVA by Mass Transportation*

Let us reconsider Eq. (4) and let us highlight the dependence of the BCVA on the measure **P**.

$$CVA\_A^{\mathbf{P}}(t,T) = \sum\_{k=1}^{K} \left( \mathbb{E}\_{\mathbf{P}} \left[ \, \delta\_k^B \cdot X\_k^A \mid \mathcal{G}\_t \right] - \mathbb{E}\_{\mathbf{P}} \left[ \, \delta\_k^A \cdot X\_k^B \mid \mathcal{G}\_t \right] \right).$$

With some abuse of notation, the measure **P** denotes the joint distribution of the default process $\delta$ and the exposure process $X$. Since both processes have finite support, **P** can be represented as a $(2K+1) \times N$ matrix with entries in $[0, 1]$. We note that the marginals of **P**, i.e. the distributions of $\delta$ and $X$ (denoted by the probability vectors $\mathbf{p}^{(X)} \in \mathbb{R}^N$ and $\mathbf{p}^{(\delta)} \in \mathbb{R}^{2K+1}$), are already predetermined by the market. Therefore, **P** has to satisfy

$$\mathbf{1}^\top \mathbf{P} = \mathbf{p}^{(X)\top}, \quad \text{and} \quad \mathbf{P}\mathbf{1} = \mathbf{p}^{(\delta)}.$$

*Remark 6* In case of independence between $\delta$ and $X$, **P** is given by the product distribution of $\delta$ and $X$, whereas in hybrid models the joint distribution **P** is determined by the specification and parametrization of the hybrid model. In the independent case, **P** is hence given by the dyadic product

$$\mathbf{P} = \mathbf{p}^{(\delta)} \mathbf{p}^{(X)^\top}.$$

Obviously, the smallest and largest CVA which can be obtained by any **P** which is consistent with the given marginals, is given by

$$\begin{aligned} \mathit{CVA}\_{A}^{l}(t,T) &:= \min\_{\mathbf{P} \in \mathcal{P}} \mathit{CVA}\_{A}^{\mathbf{P}}(t,T), \\ \mathit{CVA}\_{A}^{u}(t,T) &:= \max\_{\mathbf{P} \in \mathcal{P}} \mathit{CVA}\_{A}^{\mathbf{P}}(t,T), \end{aligned}$$

where

$$\mathcal{P} := \{ \mathbf{P} \in [0, 1]^{(2K+1) \times N} \mid \mathbf{1}^{\top} \mathbf{P} = \mathbf{p}^{(X)\top}, \ \mathbf{P}\mathbf{1} = \mathbf{p}^{(\delta)} \}.$$

It is easily seen that the set $\mathcal{P}$ is a convex polytope. Thus, the computation of $CVA_A^l(t, T)$ and $CVA_A^u(t, T)$ essentially requires the solution of a linear program, as the objective functions are linear in **P**.

*Remark 7* The structure of the above LPs coincides with the structure of so-called *balanced linear transportation problems*. Transportation problems constitute an important subclass of linear programming problems; see for example Bazaraa et al. [1], Chap. 10, for more details. Several very efficient algorithms exist for the numerical solution of such transportation problems, see also Bazaraa et al. [1], Chaps. 10–12.
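To make the linear-programming formulation concrete, the following sketch sets up the balanced transportation problem for the CVA bounds on a toy grid and solves it with `scipy.optimize.linprog`. All inputs (dimensions, marginals, and the matrix `C` of CVA contributions) are hypothetical placeholders, not the chapter's data:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# Toy dimensions: M default states (rows of P), N exposure scenarios (columns).
M, N = 5, 8

# Hypothetical marginals p^(delta) and p^(X), normalized to probability vectors.
p_delta = rng.random(M)
p_delta /= p_delta.sum()
p_x = rng.random(N)
p_x /= p_x.sum()

# Hypothetical CVA contribution of pairing default state m with exposure
# scenario n (in the chapter: the aggregated delta^B * X^A - delta^A * X^B terms).
C = rng.normal(size=(M, N))

# Transportation constraints on the flattened matrix P:
# row sums equal p_delta, column sums equal p_x.
A_eq = np.zeros((M + N, M * N))
for m in range(M):
    A_eq[m, m * N:(m + 1) * N] = 1.0   # sum over row m
for n in range(N):
    A_eq[M + n, n::N] = 1.0            # sum over column n
b_eq = np.concatenate([p_delta, p_x])

res_min = linprog(C.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))
res_max = linprog(-C.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))
lower, upper = res_min.fun, -res_max.fun

# The independent (product) coupling of Remark 6 is feasible for this LP,
# hence the independent CVA lies between the two bounds.
indep = float(p_delta @ C @ p_x)
```

Since the product coupling always satisfies the marginal constraints, the independent value is guaranteed to lie inside the computed interval.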

Let us summarize our results in the following theorem:

**Theorem 2** *Under the given prerequisites, it holds:*


The tightness of our bounds is in contrast to Turnbull [21], where only weak bounds were derived. Of course, bounds always represent a best-case and a worst-case estimate only, which may strongly under- and overestimate the true CVA.

*Remark 8* We note that a related approach of coupling default and exposure via copulas was presented by Rosen and Saunders [17] and Cespedes et al. [8]. However, their approach differs from ours in some significant aspects. First, exposure scenarios are sorted by a single number (e.g. effective exposure) in order to couple exposure scenarios with default risk factors by copulas. Second, risk factors of some credit risk model are employed instead of working with the default indicator directly. Third, their approach is restricted to the real-world setting and does not consider restrictions on the marginal distributions in the coupling process, which is necessary, e.g., if stochastic credit spreads are to be considered.

# *5.2 An Alternative Formulation as Assignment Problem*

For the above setup we have assumed that the probabilities for all possible realizations of the default indicator process can be precomputed from a suitable default model. If for some default model this is not the case, but only scenarios (with repeated outcomes for the default indicator) can be obtained by simulation, an alternative LP formulation is available. In such a scenario setting, it is advisable to choose the same number $N$ of scenarios for both Monte Carlo simulations. Then for both given marginal distributions we have $\mathbf{p}^{(\delta)}_j = \mathbf{p}^{(X)}_i = 1/N$. Applying the same arguments as above, we again obtain a transportation problem, however, with probabilities $1/N$ each. A closer look at this problem reveals that the optimization actually runs over all $N \times N$ permutation matrices, since each default scenario is mapped onto exactly one exposure scenario. This means that the problem belongs to the class of assignment problems, for which very efficient algorithms are available, cf. Bazaraa et al. [1]. Nevertheless, note that although assignment problems can be solved more efficiently than transportation problems, it is still advisable to solve the transportation problem due to its lower dimensionality, as usually $2K + 1 \ll N$ (i.e. the time discretization is usually much coarser than the exposure discretization). However, if stochastic credit spreads have to be considered, they have to be part of the default simulation, and thus assignment problems (with additional linear constraints to guarantee consistency of exposure paths and spreads) become unavoidable.
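Under the equal-weight scenario setting described above, the bounds reduce to optimal assignments, which can be sketched with `scipy.optimize.linear_sum_assignment` (a Hungarian-type solver). The per-scenario contribution matrix `C` below is a random placeholder, not simulated exposure data:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(1)
N = 200  # same number of Monte Carlo scenarios for defaults and exposures

# Hypothetical per-scenario CVA contributions: pairing default scenario i
# with exposure scenario j contributes C[i, j] / N to the CVA.
C = rng.normal(size=(N, N))

# Minimal CVA: assignment minimizing the total contribution.
ri, ci = linear_sum_assignment(C)
cva_min = C[ri, ci].sum() / N

# Maximal CVA: assignment maximizing the total contribution.
ri, ci = linear_sum_assignment(C, maximize=True)
cva_max = C[ri, ci].sum() / N
```

Any fixed pairing of scenarios (e.g. the identity assignment) yields a CVA between these two values.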

# **6 Example**

# *6.1 Setup*

To illustrate these semi-model-free CVA bounds let us give a brief example. For this purpose let us consider a standard payer swap with a remaining lifetime of *T* = 4 years analyzed within a Cox–Ingersoll–Ross (CIR) model at time *t* = 0. The time interval ]0, 4[ is split up into *K* = 8 disjoint time intervals each covering half a year. For simplicity, the loss process is again assumed to be 1.

#### **6.1.1 Counterparty's Default Modeling**

To model the defaults we have chosen the well-known copula approach with constant intensities, using the Gaussian copula. For further analyses in this example we will focus on the case of uncorrelated counterparties (ρ = 0) and highly correlated counterparties (ρ = 0.9). Furthermore, the counterparties' default intensities are assumed to be deterministic. We will distinguish between symmetric counterparties with identical default intensities and asymmetric counterparties. Thus, four different settings

**Fig. 1** Probabilities $\mathbb{E}_{\mathbb{Q}}[\delta_k^i]$ in % for Case 1 to Case 4

result: Fig. 1 shows the probabilities $\mathbb{Q}[\delta_k^i = 1] = \mathbb{E}_{\mathbb{Q}}[\delta_k^i]$ in each of the four cases under the risk-neutral measure $\mathbb{Q}$ implied from the market. To be in line with the following figures, the probabilities for a default of counterparty $B$ in $\Delta_k$, i.e. $\mathbb{E}_{\mathbb{Q}}[\delta_k^B]$, correspond to the positive bars and defaults of counterparty $A$ to the negative bars. The left plots show identical counterparties (cases 1 and 2) and the right ones the cases where counterparty $B$ has a higher default intensity (cases 3 and 4). Furthermore, the upper plots correspond to uncorrelated defaults and the lower ones to ρ = 0.9.
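A minimal sketch of such a Gaussian-copula default model with constant intensities is given below; the intensities and the correlation are illustrative values in the spirit of this section, not the exact inputs behind Fig. 1:

```python
import numpy as np
from scipy.stats import norm

# Gaussian-copula default model with constant intensities (hypothetical values).
lam_A, lam_B, rho = 0.02, 0.05, 0.9
rng = np.random.default_rng(0)
n = 100_000

# Correlated standard normals, mapped to uniforms via the Gaussian copula.
z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n)
u = norm.cdf(z)

# Inverting the exponential distribution function gives the default times.
tau_A = -np.log(1.0 - u[:, 0]) / lam_A
tau_B = -np.log(1.0 - u[:, 1]) / lam_B

# Bucketed default probabilities Q[tau_B in Delta_k, tau_B <= tau_A],
# for K = 8 half-year buckets on ]0, 4].
edges = np.arange(0.0, 4.5, 0.5)
probs = [np.mean((edges[k] < tau_B) & (tau_B <= edges[k + 1]) & (tau_B <= tau_A))
         for k in range(8)]
```

Varying ρ and the intensities reproduces the four qualitative settings discussed above.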


**Table 1** $\mathbb{E}_{\mathbb{Q}}[X_k^A]$ and $\mathbb{E}_{\mathbb{Q}}[X_k^B]$ in basis points


#### **6.1.2 Counterparty Exposure Modeling**

As already mentioned, a simple CIR model is applied for the valuation of the payer swap. Since our focus is on the coupling of the default and the exposure model, we have opted for such a simple model for ease of presentation. In the CIR model, the short rate $r_t$ follows the stochastic differential equation

$$dr\_t = \kappa(\theta - r\_t)\,dt + \sigma\sqrt{r\_t}\,dW\_t,$$

where $(W_t)_{t \ge 0}$ denotes a standard Brownian motion. Instead of calibrating the parameters to market data (yield curve plus selected swaption prices) on one specific day, we have set the parameters as follows

$$\kappa = 0.0156, \quad \theta = 0.0311, \quad \sigma = 0.0313, \quad r\_0 = 0.0301$$

to obtain an interest rate market which is typical of recent years. Considering now the discounted exposure of each counterparty within the discrete-time framework of our example, we can easily compute $\mathbb{E}_{\mathbb{Q}}[X_k^i]$ as the average over all scenarios generated by a Monte Carlo simulation. Figure 2 illustrates the results of a simulation, which are also given in Table 1. Positive bars correspond to $\mathbb{E}_{\mathbb{Q}}[X_k^A]$, negative bars to $\mathbb{E}_{\mathbb{Q}}[X_k^B]$, and the small bars correspond to $\mathbb{E}_{\mathbb{Q}}[\tilde V_A(t_k, T)]$. Since payer and receiver swaps are not completely symmetric instruments, a residual expectation remains, as can be observed from Fig. 2.
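For reference, a simulation of the CIR short rate with the parameters above might look as follows; the full-truncation Euler scheme and the step count are our choices for illustration, not necessarily those used in the chapter:

```python
import numpy as np

# CIR parameters as quoted above.
kappa, theta, sigma, r0 = 0.0156, 0.0311, 0.0313, 0.0301

def simulate_cir(n_paths, n_steps, T, seed=0):
    """Full-truncation Euler scheme for dr = kappa(theta - r)dt + sigma*sqrt(r)dW."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    r = np.full(n_paths, r0)
    paths = [r.copy()]
    for _ in range(n_steps):
        dw = rng.normal(scale=np.sqrt(dt), size=n_paths)
        # Truncate r at zero inside the diffusion term to keep sqrt well-defined.
        r = r + kappa * (theta - r) * dt + sigma * np.sqrt(np.maximum(r, 0.0)) * dw
        paths.append(r.copy())
    return np.array(paths)  # shape: (n_steps + 1, n_paths)

paths = simulate_cir(n_paths=10_000, n_steps=200, T=4.0)
```

With κ this small, mean reversion is slow, so the simulated mean at T = 4 stays close to the starting level r₀ while drifting slightly toward θ.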

**Fig. 2** Expected exposures $\mathbb{E}_{\mathbb{Q}}[X_k^A]$, $\mathbb{E}_{\mathbb{Q}}[X_k^B]$ and $\mathbb{E}_{\mathbb{Q}}[\tilde V_A(t_k, T)]$

**Fig. 3** Minimal and maximal $CVA_A$, $\mathbb{E}_{\mathbb{Q}^i}[\delta_k^B \cdot X_k^A \mid \mathcal{G}_t]$ and $-\mathbb{E}_{\mathbb{Q}^i}[\delta_k^A \cdot X_k^B \mid \mathcal{G}_t]$ in bps

# *6.2 Results*

In case of independence between default and exposure, the bilateral CVA is easily obtained by multiplying the default probabilities (as shown in Fig. 1) with the corresponding exposures (as shown in Fig. 2) and summing up. Besides the independent $CVA^i$, the minimal and maximal $CVA^l$ and $CVA^u$ have been calculated as well.

The results of these calculations are illustrated in Fig. 3 and Table 2 for each time interval $\Delta_k$. Analogously to Fig. 1, we have a separate subplot for each of the four cases, and the left plots belong again to cases 1 and 2. The positive bars now correspond to $\mathbb{E}_{\mathbb{Q}^i}[\delta_k^B \cdot X_k^A]$ and the negative ones to $\mathbb{E}_{\mathbb{Q}^i}[\delta_k^A \cdot X_k^B]$. In the case of the minimal $CVA$, $\mathbb{E}_{\mathbb{Q}^l}[\delta_k^B \cdot X_k^A]$ vanishes, meaning that for counterparty $A$, in case of a default of counterparty $B$, the exposure is zero, as the present value of the swap at that time is negative from counterparty $A$'s point of view. Conversely, for the maximal $CVA$, $\mathbb{E}_{\mathbb{Q}^u}[\delta_k^A \cdot X_k^B]$ is zero. Here, $\mathbb{Q}^u$, $\mathbb{Q}^l$, and $\mathbb{Q}^i$ denote the optimal measures for the maximal, the minimal, and the independent CVA, respectively. As expected, there




**Table 3** Computation times for the two-dimensional transportation problem

are large gaps between the lower and the independent CVA, as well as between the independent CVA and the upper bound. This means that wrong-way risk (i.e. higher exposure comes with higher default rates) can have a significant impact on the bilateral CVA. Interestingly, this observation holds true for all four cases, of course, with different significance depending on the specific setup. Although it is clear that our analysis naturally shows more extreme gaps than any hybrid model, it has to be mentioned that these bounds are indeed tight.

# *6.3 Computation Time, Choice of Algorithm, and Impact of Assumptions*

Theoretically, the computation of the bounds boils down to the solution of a linear programming problem. From this it can be expected that state-of-the-art solvers like CPLEX or Gurobi will yield the optimal solution within reasonable computation time. Using CPLEX, we have obtained the following computation times on a standard workstation (Table 3).

It can be observed that the problem can be solved for reasonable discretization levels within reasonable time. Rather similar computation times have been obtained with our own implementation of the standard network simplex based on Fibonacci heaps. However, for larger sizes, the performance of standard solvers begins to deteriorate. To dampen the explosion of computation time, we have resorted to a special-purpose solver for min-cost network flows (which generalize the transportation problem) for highly asymmetric problems, as in our case $2K + 1 \ll N$. Based on Brenner's min-cost flow algorithm, see Brenner [3], we could still solve problems with $K = 40$ and $N = 8192$ in under a minute.

If one has to resort to the assignment formulation (to account for credit spreads), computation times increase, since assignment problems now have to be solved. Here, a factor of 100 compared to the above computation times cannot be avoided.

If the coupling of the two default times is left flexible, the problem becomes a transportation problem with three margins, i.e. of size $(K+1) \times (K+1) \times N$. For these types of problems, no special-purpose solver is available and one has to resort to CPLEX. Scherer and Schulz [18] have exploited the structure of this three-dimensional transportation problem to reduce the computational complexity. They were able to reduce the problem to a standard two-dimensional transportation problem, hence rendering the computation of bounds similarly easy, no matter whether the default times are already coupled or not.

# **7 Conclusion and Outlook**

In this paper we have shown how tight bounds on the unilateral and bilateral counterparty valuation adjustment can be derived by a linear programming approach. This approach has the advantage that simulations of the uncertain loss, of the default times, and of the uncertain value of a transaction during its remaining lifetime can be completely separated. Although we have restricted the exposition to the case of two counterparties and one derivative transaction, the model can easily be extended to more counterparties and a whole netting set of trades. Further, as exposure is simulated separately from default, all risk-mitigating components like CSAs, rating triggers, and netting agreements can easily be included in such a framework.

Interesting open questions for future research include the analogous treatment in continuous time, which requires technically much more involved arguments. Further, this approach provides new motivation to consider efficient algorithms for transportation or assignment problems with more than two marginals, which have received little attention so far.

**Acknowledgements** The KPMG Center of Excellence in Risk Management is acknowledged for organizing the conference "Challenges in Derivatives Markets - Fixed Income Modeling, Valuation Adjustments, Risk Management, and Regulation".

**Open Access** This chapter is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, a link is provided to the Creative Commons license and any changes made are indicated.

The images or other third party material in this chapter are included in the work's Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work's Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.

# **References**


# **CVA with Wrong-Way Risk in the Presence of Early Exercise**

**Roberto Baviera, Gaetano La Bua and Paolo Pellicioli**

**Abstract** The Hull–White approach to CVA with embedded WWR (Hull and White, Financ. Anal. J. 68:58–69, 2012, [11]) can easily be applied also to portfolios of derivatives with early termination features. The tree-based approach described in Baviera et al. (Int. J. Financ. Eng., 2015, [1]) allows one to deal with American or Bermudan options in a straightforward way. Extensive numerical results highlight the nontrivial impact of early exercise on CVA.

**Keywords** American and Bermudan options · Wrong-way risk · Credit value adjustment

# **1 Introduction**

As a direct consequence of the 2008 financial turmoil, counterparty credit risk has become substantial in OTC derivatives transactions. In particular, the credit value adjustment (CVA) is meant to measure the impact of counterparty riskiness on a derivative portfolio value, as requested by the current Basel III regulatory framework. Accounting standards (IFRS 13, FAS 157), moreover, require a CVA<sup>1</sup> adjustment as part of a consistent fair value measurement of financial instruments.

CVA is strongly affected by derivative transaction arrangements: exposure depends on collateral and netting agreement between the two counterparties that have written

R. Baviera (B) · G. La Bua Department of Mathematics, Politecnico di Milano, 32 Piazza Leonardo da Vinci, 20133 Milano, Italy e-mail: roberto.baviera@polimi.it

G. La Bua e-mail: gaetano.labua@polimi.it

<sup>1</sup>Even if in this paper we focus on CVA pricing, it is worth noting that accounting standards also ask for a debt value adjustment (DVA) to take one's own credit risk into account.

P. Pellicioli Intesa Sanpaolo Vita S.p.A., 55/57 Viale Stelvio, 20159 Milano, Italy e-mail: paolo.pellicioli.guest@intesasanpaolovita.it

the derivative contracts of interest. Despite the increased use of collateral, however, a significant portion of OTC derivatives remains uncollateralized. This is mainly due to the nature of the counterparties involved, such as corporates and sovereigns, which lack the liquidity and operational capacity to adhere to daily collateral calls. In such cases, an institution must consider the impact of counterparty risk on the overall portfolio value, and a correct CVA quantification becomes even more important. Extensive literature has been produced on the topic in recent years; see for example [5] and [9], which give a comprehensive overview of CVA computation and the more general topic of counterparty credit risk management. It seems, however, that attention has mainly been paid to CVA for portfolios of European-style derivatives. Dealing with derivatives with early exercise features is even more delicate. Indeed, as pointed out in [3], for American- and Bermudan-style derivatives the CVA computation becomes path-dependent, since one needs to take into account the exercise strategy and the fact that exposure falls to zero after the exercise.

A peculiar problem encountered in CVA computation is the presence of so-called wrong-way risk (WWR), that is, a non-negligible dependency between the value of a derivatives portfolio and the counterparty default probability. In particular, we face WWR if a deterioration in counterparty creditworthiness is more likely when the portfolio exposure increases. Several attempts have been made to deal with WWR. From a regulatory point of view, the Basel III Committee currently requires to correct by a multiplicative factor α = 1.4 the CVA computed under the hypothesis of market-credit independence. In this way the impact of WWR is considered equivalent to a 40% increase in standard CVA. However, the Committee leaves room for financial institutions with approved models to apply for lower multipliers (floored at 1.2). This opportunity opens the way for more sophisticated models in order to reach a more efficient risk capital allocation.

Relevant contributions on alternative approaches to manage WWR include copula-based modeling as in [6], the introduction of jumps at default as in [13], the backward stochastic differential equations framework developed in [7], and the stochastic hazard rate approach in [11]. In particular, [11] introduces the idea of linking the counterparty hazard rate to the portfolio value by means of an arbitrary monotone function. The dependence structure is then described by a single parameter that controls the impact of exposures on the hazard rate. Additionally, a deterministic time-dependent function is introduced to match the counterparty credit term structure observed in the market. In this framework, CVA pricing in the presence of WWR involves just a small adjustment to the pricing machinery already in place in financial institutions. We only need to take into account the randomness incorporated into the counterparty default probabilities by means of the stochastic hazard rate and price CVA with standard techniques. This is probably the most relevant property of the model: as soon as we associate a WWR parameter with a given counterparty–portfolio combination, we are able to deal with WWR using the same pricing engine underlying standard CVA computation. As pointed out in [14], leveraging as much as possible on existing platforms should be one of the principles an optimal risk model is shaped on. However, the original approach in [11] relies on a Monte Carlo-based technique to determine the auxiliary deterministic function in order to calibrate the model to the counterparty credit structure. Obtaining this auxiliary function is the trickiest part of the calibration procedure, because it involves a "delicate" path-dependent problem that is difficult to implement for realistic portfolios.
In [1] it is shown how to overcome such a limitation by transforming the path-dependent problem into a recursive one, with a considerable reduction in the overall computational complexity. The basic idea is to consider discrete market factor dynamics and to induce a change of probability such that the new set of (transition) probabilities can be computed recursively in time. There we presented a straightforward implementation of this approach via tree methods. Trees are also a straightforward and well-understood tool to manage early termination in derivatives pricing. Combining tree-based dynamic programming with the recursive algorithm in [1] thus leads to a simple and effective procedure to price CVA with WWR when American or Bermudan features are considered. The paper is organized as follows: in Sect. 2 we review the Hull–White model for CVA in the presence of WWR and the recursive approach in [1]. In Sect. 3 we analyze the effects of early termination on CVA adjustments via numerical tests, and in Sect. 4 we study the relevant case of a long position in a Bermudan swaption. Finally, Sect. 5 reports some final remarks.

# **2 CVA Pricing and WWR**

For a given derivatives portfolio we can define the unilateral CVA<sup>2</sup> as the risk-neutral expectation of the discounted loss that can be suffered over a given period of time

$$CVA = (1 - R) \int\_{t\_0}^{T} B(t\_0, t) \, EE(t) \, PD(dt),\tag{1}$$

where usually $t_0$ is the value date (hereinafter we set $t_0 = 0$ if not stated otherwise) and $T$ is the longest maturity date in the portfolio. Here $R$ is the recovery rate, $PD(dt)$ is the probability density of counterparty default between $t$ and $t + dt$ (with no default before $t$), and $B(t_0, t)\,EE(t)$ is the discounted expected exposure at $t$. If interest rates are stochastic, the expected exposure is defined by

$$B(t\_0, t)\ EE(t) \equiv \mathbb{E}[D(t\_0, t)\ E(t)],$$

with $\mathbb{E}[\cdot]$ the expectation operator given the information at the value date $t_0$, $D(t_0, t)$ the stochastic discount factor, and $E(t)$ the (stochastic) exposure at time $t$. The latter is inherently defined by the collateral agreement that the parties have in place: for example, in uncollateralized transactions, $E(t)$ is simply the maximum of $v(t)$, the portfolio value at time $t$, and zero. For practical computation, the integral in (1) is approximated

<sup>2</sup>The party that carries out the valuation is thus considered default-free. Even if it is a restrictive assumption, unilateral CVA is the only relevant quantity for regulatory and accounting purposes. For a detailed discussion on other forms of CVA, see e.g. [9].

by choosing a discretized set of times $\mathcal{T} = \{t_i\}_{i=0,\ldots,n}$ with $t_n = T$. In particular, the Basel III standard approach for CVA valuation is

$$\text{CVA} = (1 - R) \sum\_{i=1}^{n} \frac{B\_i \ E E\_i + B\_{i-1} \ E E\_{i-1}}{2} P D\_i,\tag{2}$$

where $B_i$ stands for<sup>3</sup> $B(t_0, t_i)$ and

$$PD\_i \equiv SP\_{i-1} - SP\_i,$$

where $SP_i$ is the counterparty survival probability up to $t_i$. Assuming that default is modeled by means of a generic intensity-based model, we can link survival probabilities to the so-called hazard rate function $h(t)$ (see e.g. [15]):

$$SP\_i = \exp\left(-\int\_{t\_0}^{t\_i} h(t) \, dt\right).$$
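As an illustration, the standard CVA formula (2), with survival probabilities obtained from a piecewise-constant hazard rate, can be sketched as follows; the recovery, hazard rate, discount curve, and exposure profile are all toy values, not market data:

```python
import numpy as np

# Toy inputs for the standard (independence) CVA formula (2).
R = 0.4                                   # recovery rate
t = np.linspace(0.0, 4.0, 9)              # t_0 = 0, ..., t_n = T = 4
h = np.full(8, 0.02)                      # piecewise-constant hazard rate

# SP_i = exp(-integral of h up to t_i); PD_i = SP_{i-1} - SP_i.
SP = np.exp(-np.cumsum(np.concatenate([[0.0], h * np.diff(t)])))
PD = SP[:-1] - SP[1:]

B = np.exp(-0.03 * t)                     # discount factors B(t_0, t_i)
EE = np.full(9, 100.0)                    # flat expected exposure (toy profile)

# Trapezoidal weighting of the discounted exposure, as in (2).
cva = (1 - R) * np.sum((B[1:] * EE[1:] + B[:-1] * EE[:-1]) / 2 * PD)
```

With these inputs the CVA is simply a probability-weighted average of the discounted exposure profile, scaled by the loss given default.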

A common assumption is to consider $h(t)$ constant between two consecutive dates in the set $\mathcal{T}$. Pricing CVA with (2) is valid if there is no "market-credit" dependency. However, in the case of wrong-way risk (WWR) a new, more sophisticated model is needed, because exposure and counterparty default probabilities are no longer independent: exposure is conditional on default, and a positive "market-credit" dependence originates the WWR. Recently, Hull and White [11] have proposed an approach to WWR that is financially intuitive: the conditional hazard rate is modeled as a stochastic quantity related to the portfolio value $v(t)$ through a monotonically increasing function. In the following we focus on the specific functional form

$$\tilde{h}(t) = \exp\left(a(t) + b\,\nu(t)\right),\tag{3}$$

where $b \in \mathbb{R}_+$ is the WWR parameter. However, the results still hold for an arbitrary order-preserving function. The function $a(t)$ is a deterministic function of time, chosen in such a way that on each date

$$SP\_i = \mathbb{E}\left[\exp\left(-\int\_{t\_0}^{t\_i} \tilde{h}(t) \, dt\right)\right] \quad \forall i = 1, \ldots, n. \tag{4}$$

Combining (3) and (4) we clearly see that function *a*(*t*) depends also on the value specified for the parameter *b*.

The main advantage of this model is that once we know *b* and *a*(*t*), WWR can be implemented easily by means of a simple generalization of (2):

<sup>3</sup>From now on we use the notation $x_i$ to represent a discrete-time variable, while $x(t)$ indicates the analogous variable in continuous time. For the avoidance of doubt, any other form of dependency (·) does not refer to the temporal one, unless stated otherwise.


$$CVA\_W = (1 - R) \sum\_{i=1}^{n} \mathbb{E} \left[ \frac{D\_i \ E\_i + D\_{i-1} \ E\_{i-1}}{2} \widetilde{P} \widetilde{D}\_i \right],\tag{5}$$

where $\widetilde{PD}_i$ is the stochastic probability of default between $t_{i-1}$ and $t_i$, defined in terms of $\tilde h_i$. We want to stress that the expectation in (5) can be computed via any feasible numerical method: this implies that, given $b$ and $a(t)$, taking WWR into account just requires a slight modification of the payoff in existing algorithms used for the calculation of CVA.

We now briefly recall the recursive approach presented in [1] that avoids the path dependency in the determination of $a(t)$ such that Eq. (4) is satisfied. Hereinafter we refer to the technique to obtain this function as either the calibration of $a(t)$ or the "calibration problem": once the three sets of parameters (the recovery $R$, the default probabilities $PD$s, and the WWR parameter $b$) for the dealer's clients have been estimated (e.g. with statistical methods), this is the most involved step in the calibration of the Hull–White model.

Let us assume that the market risk factors underlying the portfolio are discrete, and let us indicate with *j<sub>i</sub>* the discrete state variable that describes the market at time *t<sub>i</sub>*. In this framework the market dynamics are described by a Markov chain with

$$q\_i(j\_{i-1}, j\_i) \quad \forall i = 1, \ldots, n$$

the transition probability between *j<sub>i−1</sub>* at time *t<sub>i−1</sub>* and *j<sub>i</sub>* at time *t<sub>i</sub>*. Typical examples where such a discrete approach is natural are lattice models. In particular, in [1], we applied tree methods to the pricing of CVA for linear derivatives portfolios.

Embedding the Hull–White model (3) in our setting, the stochastic survival probability between *t<sub>i−1</sub>* and *t<sub>i</sub>* becomes

$$\tilde{P}\_i(j) \equiv \exp\left(-(t\_i - t\_{i-1})\,\,\tilde{h}\_i(j)\right) \equiv P\_i\,\,\eta\_i(j) \qquad \forall i = 1, \ldots, n,\tag{6}$$

where

$$P\_i \equiv \frac{SP\_i}{SP\_{i-1}}$$

is the forward survival probability between *t<sub>i−1</sub>* and *t<sub>i</sub>* valued in *t*<sub>0</sub>. For notational convenience, we also set *P*˜<sub>0</sub>(*j*<sub>0</sub>) = η<sub>0</sub>(*j*<sub>0</sub>) = 1. The η process introduced in (6) can be seen as the driver of the stochasticity in survival probabilities; it plays a key role in circumventing path dependency in the calibration of *a*(*t*), as shown in the following proposition.

#### **Proposition**

In the model with discrete market risk factors, the calibration problem (4) becomes

$$\sum\_{j\_i} p\_i(j\_i)\,\eta\_i(j\_i) = 1 \quad \forall i = 1, \ldots, n,\tag{7}$$

where *p<sub>i</sub>*(*j<sub>i</sub>*) are probabilities that can be obtained via the recursive equation

$$p\_i(j\_i) = \sum\_{j\_{i-1}} q\_i(j\_{i-1}, j\_i) \ \eta\_{i-1}(j\_{i-1}) \ p\_{i-1}(j\_{i-1}) \ \quad \forall i = 1, \ldots, n,\tag{8}$$

with the initial condition *p*<sub>0</sub>(*j*<sub>0</sub> = 0) = 1.

*Proof* See [1].

Thus the calibration problem (4) can be solved at each discrete date *t<sub>i</sub>* via (7) by simply exploiting the fact that the process η, which is not path-dependent, is a martingale under the probability measure *p*. Equation (8), in addition, specifies an algorithm to build this new probability measure recursively. In this framework *PD*˜<sub>*i*</sub> can be readily obtained from (6). Let us mention that, although this is just one of the viable approaches to solve (4), it turns out to be, as shown in the next section, a natural way to handle the additional complexity induced by early exercise within the Hull–White approach to WWR modeling.
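The recursion can be sketched on a toy binomial tree. The structure of the loop follows the proposition — propagate *p* via (8), then solve the one-dimensional equation (7) for the current value of *a*(*t*) — while the specific intensity form h̃<sub>*i*</sub>(*j*) = exp(*a<sub>i</sub>* + *b* *v<sub>i</sub>*(*j*)), the tree, and all numbers below are assumptions made for the sketch only:

```python
import numpy as np

# Toy CRR-style tree for the market factor; all numbers are hypothetical.
n, dt, b = 10, 0.1, 0.05
u, q_up = 1.1, 0.5
S0, K = 100.0, 100.0

def v(i, j):
    """Toy portfolio value at node (i, j): a forward-style payoff."""
    return S0 * u**j * (1.0 / u)**(i - j) - K

# Flat CDS curve: survival probabilities SP_i and forward survivals P_i.
s, R = 0.0125, 0.4
t = dt * np.arange(n + 1)
SP = np.exp(-s * t / (1.0 - R))
P = SP[1:] / SP[:-1]

def eta(a_i, i):
    """eta_i(j) = exp(-dt h_i(j)) / P_i with the assumed h_i(j) = exp(a_i + b v_i(j))."""
    j = np.arange(i + 1)
    return np.exp(-dt * np.exp(a_i + b * v(i, j))) / P[i - 1]

def solve_a(p, i, lo=-30.0, hi=10.0):
    """Bisection for (7): sum_j p(j) eta_i(j; a) = 1; the sum is decreasing in a."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if p @ eta(mid, i) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

p, eta_prev = np.array([1.0]), np.array([1.0])    # p_0(0) = 1, eta_0 = 1
a = np.zeros(n + 1)
for i in range(1, n + 1):
    w = eta_prev * p                               # recursion (8): propagate p
    p = np.zeros(i + 1)
    p[1:] += q_up * w                              # up-moves
    p[:-1] += (1.0 - q_up) * w                     # down-moves
    a[i] = solve_a(p, i)                           # calibration condition (7)
    eta_prev = eta(a[i], i)

print("calibrated a(t_i):", np.round(a[1:], 4))
```

Note how the martingale property shows up numerically: after each calibration step, the new measure *p* again sums to one, so the next root-finding problem is well posed.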

# **3 The Impact of Early Exercise**

As already anticipated in Sect. 1, CVA gives rise to additional features when early exercise is allowed. In this section we highlight the differences in CVA figures between European and American options, implementing the tree-based procedure described in the previous section. It is well known that backward induction and dynamic programming applied on (recombining) trees are probably the simplest and most intuitive tools to price derivatives with early exercise features such as American options. For these options, indeed, Monte Carlo techniques turn out to be computationally intensive in the case of CVA: the exercise date, after which the exposure falls to zero, depends on the path of the underlying asset and on the exercise strategy. In such a case we must describe two random times: the optimal exercise time and the counterparty default time.

# *3.1 The Pricing Problem*

Since our goal is to study the effects of early exercise clauses on CVA, we focus on the case of a dealer that enters into a long position<sup>4</sup> on American-style derivatives with a defaultable counterparty. That is, the dealer is the holder of the option and she has the opportunity to choose the optimal exercise strategy in order to maximize the option value. In particular, following [3], we would need to differentiate between two possible assumptions depending on the effects of counterparty defaultability on

<sup>4</sup>A short option position does not produce any potential CVA exposure.

the exercise strategy: the option holder may or may not take into account the possibility of counterparty default when choosing whether to exercise. In the former case, the continuation value (the value of holding the option until the next exercise date) should be adjusted for the possibility of default. However, following the current practice in CVA computation, we assume that counterparty defaultability plays no role in defining the exercise strategy of the dealer. This means that the pricing problem (before any CVA consideration) is the classical one for American options in a default-free world.

Let us assume we have a tree for the evolution of the market risk factors<sup>5</sup> up to time *T*. Hereinafter, without loss of generality, we set a constant time step Δ*t* and denote the time partition on the tree by means of an index *i* in *T* = {*t<sub>i</sub>*}<sub>*i*=0,...,*n*</sub> with *t<sub>i</sub>* = *i* Δ*t*. We further introduce an arbitrary set of *m* exercise dates *E* = {*e<sub>k</sub>*}<sub>*k*=1,...,*m*</sub> with *E* ⊆ *T* at which the holder can exercise her rights, receiving a payoff φ<sub>*k*</sub> that may depend on the specific exercise date *e<sub>k</sub>*. In this setting we can deal indistinctly with European (*m* = 1), Bermudan (*m* ∈ ℕ), and American options (*m* → ∞). The standard dynamic programming approach then allows us to compute the derivative value at each node of the tree:

$$v\_{i}(j\_{i}) = \begin{cases} \phi\_{m}(j\_{i}) & \text{for } i \text{ s.t. } t\_{i} = e\_{m} = T, \\ \max(c\_{i}(j\_{i}), \, \phi\_{k}(j\_{i})) & \text{for } i \text{ s.t. } t\_{i} \in \mathcal{E} \setminus \{e\_{m}\}, \\ c\_{i}(j\_{i}) & \text{otherwise,} \end{cases} \tag{9}$$

with *c<sub>i</sub>* the continuation value of the derivative defined as

$$c\_i(j\_i) = B(i, i+1; j\_i) \sum\_{j\_{i+1}} q\_i(j\_i, j\_{i+1}) \, v\_{i+1}(j\_{i+1}),\tag{10}$$

where the sum is taken over all possible *t<sub>i+1</sub>*-nodes connected to *j<sub>i</sub>* at time *t<sub>i</sub>*, and *B*(*i*, *i* + 1; *j<sub>i</sub>*) is the discount factor that applies from *t<sub>i</sub>* to *t<sub>i+1</sub>*, possibly depending on the state variable *j<sub>i</sub>* on the tree.

We describe in detail the simple 1-dimensional tree; however, extensions to the 2-factor case (as, for example, the G2++ model in [4] or the recent dual-curve approach in [12]) are straightforward. Once the derivative value is computed for all nodes and the WWR parameter *b* is specified,<sup>6</sup> we can calibrate the auxiliary function *a*(*t*) in (3) by means of the recursive approach in [1]. The advantages of such an approach are, in this case, twofold: we avoid path dependency in the calibration of *a*(*t*), as in any other possible application, and we deal with early exercise via (9) and (10) in a very intuitive way.
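For concreteness, the dynamic programme (9)–(10) on a one-dimensional CRR binomial tree might look as follows. This is a sketch with illustrative parameters; the risk-neutral probability under a cost of carry is the standard CRR construction, not a prescription of the text:

```python
import numpy as np

# Illustrative parameters; negative cost of carry so early exercise can pay.
S0, K, r, sigma, T, n = 100.0, 100.0, 0.01, 0.25, 1.0, 500
coc = -0.02                              # cost of carry
dt = T / n
u = np.exp(sigma * np.sqrt(dt))
d = 1.0 / u
q = (np.exp(coc * dt) - d) / (u - d)     # risk-neutral up-probability
disc = np.exp(-r * dt)                   # one-period discount factor B(i, i+1)

def price(american: bool) -> float:
    j = np.arange(n + 1)
    v = np.maximum(S0 * u**j * d**(n - j) - K, 0.0)   # terminal payoff phi_m
    for i in range(n - 1, -1, -1):
        v = disc * (q * v[1:] + (1 - q) * v[:-1])     # continuation value (10)
        if american:                                  # exercise set E = all dates
            j = np.arange(i + 1)
            v = np.maximum(v, S0 * u**j * d**(i - j) - K)   # max in (9)
    return float(v[0])

eu, am = price(False), price(True)
print(f"European: {eu:.4f}  American: {am:.4f}")
```

By construction the American value dominates the European one; with the carry below the risk-free rate, the early exercise premium is strictly positive, in line with the discussion of Fig. 1.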

<sup>5</sup>If we describe the dynamics of the price of a corporate stock, we assume, for the sake of simplicity, that such an entity is not subject to default risk.

<sup>6</sup>We refer the interested reader to the original paper [11] for a heuristic approach to determine the parameter and to [14] for comprehensive numerical tests with market data.

# *3.2 The Plain Vanilla Case*

We now want to assess the impact of early termination on CVA in order to understand the potential differences that could arise between European and American options from a counterparty credit risk management perspective.

In the first test we study the plain vanilla option case: we assume that the dealer buys a call option from a defaultable counterparty. Counterparty default probabilities are described in terms of a flat CDS curve at 125 basis points as in [11]. More precisely, with a flat CDS curve we can approximate quite well the survival probability between *t*<sub>0</sub> and *t<sub>i</sub>* as

$$SP\_i = \exp\left(-\frac{s\_i \ t\_i}{1 - R}\right),$$

where *s<sub>i</sub>* is the credit spread relative to maturity *t<sub>i</sub>* and *R* the recovery rate, equal to 40 %. We further assume that trades are fully uncollateralized.<sup>7</sup> The underlying asset is lognormally distributed and represented by means of a Cox–Ross–Rubinstein binomial tree. We can thus apply the dynamic programming approach described above to price options on the tree and calibrate the function *a*(*t*) recursively via (7). This procedure turns out to be quite fast: the Matlab-coded algorithm takes less than 0.1 second to run on a 3.06 GHz desktop PC with 4 GB RAM when *n* = *m* = 500. Figure 1 shows the CVA profiles<sup>8</sup> for both European and American call options as a function of the WWR parameter *b* and for different levels of cost of carry. From standard no-arbitrage arguments, we indeed know that the optimality of early exercise for plain vanilla call options is related to the cost of carry (defined as the net cost of holding positions in the underlying asset).<sup>9</sup>

As shown in Fig. 1, CVA profiles are significantly different for European and American options when early exercise can represent the optimal strategy (black and dark gray lines). In particular, the impact of WWR is significantly less pronounced for American options than for the corresponding European ones. On the other hand, when early exercise is no longer optimal, the two options are equivalent: the light gray lines in Fig. 1 are indistinguishable from each other. In addition, the upward shift in CVA exposures is due to the fact that an increase in the cost of carry (e.g., a reduction in the dividend yield) is reflected entirely in an augmented drift of the underlying asset dynamics that makes, *ceteris paribus*, the call option more valuable.

The effect of early exercise on exposure profiles is depicted in Fig. 2 where a possible underlying asset path is displayed along with the optimal exercise boundary

<sup>7</sup>Here we are interested in analysing the full exposure profile as a function of early exercise opportunities. On the other hand, more realistic collateralization schemes can be taken into account in a straightforward manner within the described framework.

<sup>8</sup>Once *b* and *a*(*t*) are determined, we can use any numerical technique to compute (5). Here we simply implement a simulation-based scheme that uses the tree as a discretization grid. The number of generated paths is 10<sup>5</sup>.

<sup>9</sup>The classical example is an option written on a dividend-paying stock. This framework also includes a call option on a commodity whose forward curve is in backwardation, or on a currency pair for which the interest rate of the base currency is higher than that of the reference currency.

**Fig. 1** CVA profiles for European and American options as a function of the WWR parameter *b* for several levels of cost of carry (CoC). Parameters are *S*<sub>0</sub> = 100, *K* = 100, σ = 25 %, *r* = 1 %, *T* = 1, *n* = *m* = 500. Counterparty CDS curve flat at 125 basis points

**Fig. 2** The effect of early exercise on exposures. Parameters are *S*<sub>0</sub> = 100, *K* = 100, σ = 25 %, *r* = 1 %, CoC = −2 %, *T* = 1, *n* = *m* = 500. *Left-hand scale* asset path (*black solid line*) and optimal exercise boundary (*dashed line*). *Right-hand scale* European option (*light gray line*) and American option (*dark gray line*)

(reconstructed on the binomial tree) and the corresponding value of the European and American options. As long as the asset value remains within the continuation region (the area below the dashed line), the two options have a similar value, the only difference being the early exercise premium embedded in the American-style derivative. However, if the asset value reaches or crosses the exercise boundary, the exposure due to the American option falls to zero, while the European option remains alive until maturity. From the definition of CVA (1), we can see that early exercise, if optimal, reduces the exposure of the holder to the counterparty default by shortening the life of the option. The effect is even more pronounced when we introduce WWR: early redemption, indeed, would occur as soon as the portfolio value is large enough, with the consequence of eliminating the exposure precisely when counterparty

**Fig. 3** Difference in CVA between European and American options as a function of the WWR parameter *b* and moneyness. Parameters are *S*<sub>0</sub> = 100, σ = 25 %, *r* = 1 %, CoC = −2 %, *T* = 1, *n* = *m* = 500. Counterparty CDS curve flat at 125 basis points

default probabilities become more relevant. The early termination clause can thus be identified as an important mechanism that limits CVA charges, particularly when market–credit dependency is non-negligible, as shown in [8] for the case without WWR. Any change that makes early exercise more likely tends to enhance such a mechanism. We see this effect in Fig. 3, where we display the difference in CVA between European and American options as a function of the WWR parameter and the option moneyness. For a given underlying asset dynamics, the potential early exercise date is closer for more in-the-money options: the right of the holder is more likely to be exercised sooner. This shortens the life of the option and reduces both the CVA charge (with respect to European options) and the WWR sensitivity (with respect to the corresponding European option and American options with lower moneyness). In this section we have shown that WWR can play a very different role for European and American options. In our opinion, however, WWR should be analyzed on a case-by-case basis in order to determine its magnitude and the adequate capital charge: a 40 % increase in standard CVA could overestimate the losses for an American option that can be optimally exercised in a short period, while it could understate them in cases where early termination is less likely.

# **4 The Bermudan Swaption Case**

Probably the most relevant case of a long position on options with early exercise opportunities in the portfolios of financial institutions is represented by Bermudan swaptions. Such exotic derivatives are, indeed, used by corporate entities to enhance the financial structure related to the issuance of callable bonds. Often, by selling a Bermudan receiver swaption to a dealer, the callable bond issuer can reduce its net borrowing cost. Usually the swaption is structured such that the exercise dates match

**Table 1** Diagonal implied volatility of European ATM swaptions used to calibrate the 1-factor Hull–White model


Calibrated parameters are *â* = 0.0146 and σ̂ = 0.0089

the callability schedule of the bond.<sup>10</sup> Let *T*̂ be the bond maturity date. The dealer has the right, at any exercise date *e<sub>k</sub>* ∈ *E* \ {*e<sub>m</sub>*}, to enter into an interest rate swap with maturity *T*̂, where she receives the fixed rate *K* (equal to the fixed coupon rate of the bond) and pays the floating rate to the bond issuer, with the first payment made on date *e<sub>k+1</sub>*. In our test we use Euro interbank market data as of September 13, 2012 as given in [2]. We assume that the dealer buys a 10-year Bermudan receiver swaption where the underlying swap has, for simplicity, both fixed and floating legs with semiannual payments. The swaption can be exercised semiannually and its notional amount is Eur 100 million. We describe the interest rate dynamics with a 1-factor Extended Vasicek model on a trinomial tree as in [10]. Model parameters are calibrated to market prices of European ATM swaptions with overall contract maturity equal to 10 years, as shown in Table 1. As in the previous section, we value the Bermudan swaption on the tree via dynamic programming and calibrate the WWR model function *a*(*t*). Once again the combined approach on the tree allows us to perform both tasks in a negligible amount of time. Figure 4 reports the WWR impact<sup>11</sup> for uncollateralized transactions struck at different levels of moneyness: at the money (swaption strike set equal to the market 10-year spot swap rate) and ±50 basis points. The upper graph reports the case with no initial lockout period, while in the lower one we assume that the option cannot be exercised in the first 2 years. When the option can be exercised with no restrictions, we observe a moderate inverse relationship between moneyness and WWR impact due to the protection mechanism: the opportunity to exercise early when the exposure is large limits the effect of increased counterparty default probabilities. On the other hand, the introduction of a lockout period intensifies the WWR impact.
Intuitively, by expanding the lockout period we move toward the limiting case of a European option. In this case the moneyness–WWR effect is reversed: the more in the money the option is, the more relevant the WWR effect becomes. During the lockout period the in-the-money option has a considerably higher exposure to counterparty default that cannot be mitigated via early termination.

<sup>10</sup>Often the bond can be called at any coupon payment date after an initial lockout period.

<sup>11</sup>We define it to be the ratio *CVA<sub>W</sub>*/*CVA* as given, respectively, by (5) and (2).

**Fig. 4** Impact of WWR on Bermudan receiver swaptions as a function of the WWR parameter *b* for several levels of moneyness. Market data as of September 13, 2012. Counterparty CDS curve flat at 125 basis points

# **5 Concluding Remarks**

Nowadays WWR is a crucial concern in OTC derivatives transactions. This is particularly true for the uncollateralized trades that a financial institution may have in place with medium-sized corporate clients. The presence of early termination clauses in vulnerable derivatives portfolios makes the CVA computation even trickier. We have presented a simple and effective approach to deal with the calibration and pricing of CVA within the Hull–White framework [11] for American or Bermudan options. We extended the procedure in [1] to the dynamic programming algorithm required to take into account the free boundary problem inherent in the pricing of such derivatives. The numerical tests carried out underline the importance of adequate procedures to differentiate CVA profiles for European and American options. The possibility of early exercise, indeed, plays a remarkable role in mitigating WWR: an undifferentiated CVA pricing for contingent claims with different exercise styles would then lead to severe misspecification of regulatory capital charges.

An interesting topic for further research would be the impact of counterparty defaultability on the dealer's optimal exercise strategy. Even if intuitive, this poses nontrivial problems, mainly due to the interrelation among derivative pricing, WWR, and the calibration of the function *a*(*t*). It is our opinion, however, that the described framework could be extended in this direction.

**Acknowledgements** The KPMG Center of Excellence in Risk Management is acknowledged for organizing the conference "Challenges in Derivatives Markets - Fixed Income Modeling, Valuation Adjustments, Risk Management, and Regulation".

**Open Access** This chapter is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, a link is provided to the Creative Commons license and any changes made are indicated.

The images or other third party material in this chapter are included in the work's Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work's Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.

# **References**


# **Simultaneous Hedging of Regulatory and Accounting CVA**

**Christoph Berns**

**Abstract** As a consequence of the recent financial crisis, Basel III introduced a new capital charge, the CVA risk charge, to cover the risk of future CVA fluctuations (CVA volatility). Although Basel III allows for hedging the CVA risk charge, mismatches between the regulatory (Basel III) and accounting (IFRS) rules make hedging the CVA risk charge challenging: the hedge instruments that reduce the CVA risk charge cause additional profit and loss (P&L) volatility. In the present article, we propose a solution which jointly optimizes the CVA risk charge and the P&L volatility from hedging.

**Keywords** CVA risk charge · Accounting CVA · Hedging · Optimization

# **1 Introduction**

Counterparty credit risk is the risk that a counterparty in a derivatives transaction defaults prior to the expiration of the trade and is therefore unable to fulfill its contractual obligations. Before the recent financial crisis many market participants believed that some counterparties would never fail ("too big to fail"), and counterparty risk was therefore generally considered insignificant. This view changed with the bankruptcy of Lehman Brothers during the financial crisis, and market participants realized that even major banks can fail. For that reason, counterparty risk is nowadays considered to be significant for investment banks. The International Financial Reporting Standards (IFRS) demand that the fair value of a derivative incorporate the credit quality of the counterparty. This is achieved by a valuation adjustment which is commonly referred to as the credit valuation adjustment (CVA), see e.g. [3–5]. The CVA is part of the IFRS P&L, i.e., losses (gains) caused by changes in the counterparties' credit quality reduce (increase) the balance sheet equity.

Basel III requires a capital charge for future changes in the credit quality of derivative counterparties, i.e., CVA volatility. Banks can either use a standardized approach to

C. Berns (B)

KPMG AG, Am Flughafen, 60549 Frankfurt am Main, Germany e-mail: christophberns@kpmg.com

© The Author(s) 2016

K. Glau et al. (eds.), *Innovations in Derivatives Markets*, Springer Proceedings in Mathematics & Statistics 165, DOI 10.1007/978-3-319-33446-2\_6

compute this capital charge or an internal model [2]. The latter charge is commonly referred to as the CVA risk charge. Many banks have implemented a CVA desk in order to actively manage their CVA risk. CVA desks buy CDS protection on the capital markets to hedge the counterparty credit risk of uncollateralized derivatives which have been bought by the ordinary trading desks. Recognizing that banks actively manage CVA positions, Basel III allows for hedging the CVA risk charge using credit hedges such as single-name CDSs and CDS indices. However, the recognition of hedges differs depending on whether the standardized approach or an internal model is used [2].

Summarizing, we can look at counterparty credit risk from two different perspectives: the regulatory (Basel III) and the accounting (IFRS) one. Depending on whether we consider counterparty risk from a regulatory or an accounting perspective, different valuation methods are applied to this risk. In general, the regulatory treatment of counterparty risk is more conservative than the accounting one, cf. [6]. The difference between the regulatory and the accounting treatment of counterparty risk causes the following problem in hedging the CVA risk charge: eligible hedge instruments such as CDSs lead to a reduction of the CVA risk charge; on the other hand, under IFRS, a CDS is recognized as a derivative and thus accounted for at fair value through profit and loss, thereby introducing further P&L volatility.

The current accounting and regulatory rules expose banks to the situation that they cannot achieve regulatory capital relief and low P&L volatility simultaneously. Deutsche Bank, for instance, largely hedged the CVA risk charge in the first half of 2013. The hedging strategy that reduced the CVA risk charge caused large losses due to additional P&L volatility, cf. [7]. This example illustrates the mismatch between the regulatory and accounting treatment of CVA.<sup>1</sup> The mismatch demands a trade-off between these two regimes, cf. [8]. For this reason, we propose in this article an approach which leads to an optimal allocation between CVA risk charge reduction and P&L volatility. Our considerations are restricted to the standardized CVA risk charge.

We start with an explanation of the standardized CVA risk charge, i.e., the regulatory treatment of CVA. Afterwards, we show that the standardized CVA risk charge can be interpreted as a (scaled) volatility/variance of a portfolio of normally distributed positions. This interpretation reveals the modeling assumptions of the regulator and will be crucial for the later considerations. In a next step, we explain counterparty risk modeling from an accounting perspective and compute the impact of the hedge instruments (used to reduce the CVA risk charge) on the overall P&L volatility, assuming that the risk factor returns are normally distributed. Without the mismatch between the regulatory and the accounting regime, the hedge instruments would move anti-correlated to the corresponding accounting CVAs and the resulting common volatility would be small. Due to the mismatch, the CVA and the hedge instrument changes do not offset completely. For this reason we introduce a

<sup>1</sup>Due to the exclusion of DVA from the Basel III regulatory calculation, the mismatch potentially intensifies.

synthetic<sup>2</sup> total volatility σ<sub>*syn*</sub> consisting basically of the sum of the additional accounting P&L volatility σ<sub>*hed*</sub> caused by fair value changes of the hedge instruments (hedge P&L volatility) and the regulatory CVA volatility σ<sub>*CVA*,*reg*</sub> (i.e., basically the CVA risk charge)<sup>3</sup>:

$$
\sigma\_{syn}^2 = \sigma\_{hed}^2 + \sigma\_{CVA,reg}^2. \tag{1}
$$

Hence, (1) defines a steering variable describing the common effects of CVA risk charge hedging and the resulting P&L volatility. One should mention that formula (1) may suggest statistical independence of the two quantities. However, there exists a dependence in the following sense: both the regulatory CVA volatility and the hedge P&L volatility depend on the hedge amount. The more we hedge, the smaller σ<sub>*CVA*,*reg*</sub>; on the other hand, the more we hedge, the larger σ<sub>*hed*</sub>. The definition of the synthetic volatility as the sum of σ<sup>2</sup><sub>*hed*</sub> and σ<sup>2</sup><sub>*CVA*,*reg*</sub> can be motivated by the following consideration: the term σ<sup>2</sup><sub>*CVA*,*reg*</sub> is related to the regulatory capital demand for CVA risk. The other term, σ<sup>2</sup><sub>*hed*</sub>, can be interpreted as the capital demand for the market risk of the hedge instruments. Although the hedge instruments are excluded from the regulatory capital demand computation for market risk, they potentially reduce the balance sheet equity and therefore may reduce the available regulatory capital. The sum in (1) is then motivated by the additivity of the total capital demand.

In the following we consider σ<sub>*syn*</sub> as a function of the hedge amount and search for its minimum. The hedge amount minimizing σ<sub>*syn*</sub> leads to the optimal allocation between CVA risk charge relief and P&L volatility. We derive analytical solutions, and the discussion of several special cases provides an intuitive understanding of the optimal allocation. For technical reasons we exclude index hedges in the derivation of the optimal hedge strategy; however, it is easy to generalize the results to the case where index hedges are allowed.
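The trade-off can be illustrated with a stylized single-counterparty sketch. All numbers below are hypothetical, and the assumption that the hedge P&L volatility is linear in the CDS notional is made for the sketch only; the minimizer of the resulting quadratic can be found on a grid and compared with its closed form:

```python
import numpy as np

# Hypothetical single-counterparty inputs: regulatory weight, maturities,
# exposure-at-default, and an accounting P&L volatility per unit CDS notional.
omega, M, M_hed, EAD = 0.03, 3.0, 3.0, 100.0
sigma_cds = 0.02

B = np.linspace(0.0, 1.5 * EAD, 1501)            # candidate hedge notionals
sigma_reg2 = (omega * (M * EAD - M_hed * B))**2  # regulatory CVA variance, cf. (7)
sigma_hed2 = (sigma_cds * B)**2                  # hedge P&L variance (assumed linear in B)
sigma_syn2 = sigma_hed2 + sigma_reg2             # steering variable (1)

B_opt = B[np.argmin(sigma_syn2)]
# Closed form for this quadratic:
# B* = omega^2 M M_hed EAD / (sigma_cds^2 + (omega M_hed)^2)
B_closed = omega**2 * M * M_hed * EAD / (sigma_cds**2 + (omega * M_hed)**2)
print(f"grid optimum: {B_opt:.2f}  closed form: {B_closed:.2f}")
```

Note that the optimum lies strictly below the full regulatory hedge *B* = *EAD*: as soon as the hedge itself creates accounting P&L volatility, hedging the charge completely is no longer optimal.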

# **2 Counterparty Risk from a Regulatory Perspective: The Standardized CVA Risk Charge**

In this section we introduce the standardized CVA risk charge. A detailed explanation of all involved parameters is given in the Basel III document [2]. The formula for the standardized CVA risk charge is prescribed by the regulator and is used to determine the amount of regulatory capital which banks must hold in order to absorb possible losses caused by future deteriorations of the counterparties' credit quality. We will see that the standardized CVA risk charge can be interpreted as the volatility (i.e., standard deviation) of a normally distributed random variable. More precisely, we will show that the CVA risk charge can be interpreted as the 99 % quantile of a portfolio of

<sup>2</sup>We use the word synthetic since σ*syn* mixes a volatility measured in regulatory terms and a volatility measured in accounting terms.

<sup>3</sup>This connection will be explained later.

positions subject to normally distributed CVA changes (i.e., CVA P&L) only. This gives some insight into the regulator's modeling assumptions for future CVA. It is worth mentioning that the regulator's modeling assumptions may or may not hold in reality. A detailed look at these assumptions can be found in [6].

In order to be prepared for later computations, we introduce in this section some notation and recall some facts about normally distributed random variables.

The standardized CVA risk charge *K* is given by [2]:

$$K = \beta \sqrt{h} \Phi^{-1}(q) \tag{2}$$

with


$$\begin{split} \beta^2 &= \left(\sum\_{i=1}^n 0.5 \cdot \omega\_i \left(M\_i EAD\_i - M\_i^{hed} B\_i\right) - \alpha\_{ind} M\_{ind} B\_{ind}\right)^2 \\ &+ \sum\_{i=1}^n 0.75 \cdot \omega\_i^2 \left(M\_i EAD\_i - M\_i^{hed} B\_i\right)^2 \end{split} \tag{3}$$

with


Formula (2) is determined by the regulator. In order to get a better understanding of this formula, we will derive a stochastic interpretation of it. Before that, we need to recall a fact about normal distributions: if the random vector *X⃗* has a multivariate normal distribution, i.e., *X⃗* ∼ *N*(0, Σ) with mean 0 and covariance matrix Σ, then, for a deterministic vector *a⃗*, the scalar product

$$\langle \vec{a}, \vec{X} \rangle := \sum\_{i} a\_{i} X\_{i} \tag{4}$$

<sup>4</sup>For simplicity we consider only one index hedge. The results in this article can easily be generalized to more than one index hedge.

has a univariate normal distribution with mean 0 and variance

$$
\sigma^2 = \langle \vec{a}, \Sigma \vec{a} \rangle. \tag{5}
$$

Now we are able to derive the stochastic interpretation of the CVA risk charge, more precisely, its interpretation as a volatility.

# *2.1 Standardized CVA Risk Charge as Volatility*

In this section we will show that the regulator's modeling assumptions behind the standardized CVA risk charge are given by normally distributed CVA returns which are aggregated using a one-factor Gaussian copula model.<sup>5</sup> We consider *n* counterparties. By *R<sub>i</sub>* we denote the (one-year) CVA P&L (i.e., those P&L effects caused by CVA changes) w.r.t. counterparty *i*.

**Lemma 1** *If one assumes R<sub>i</sub>* ∼ *N*(0, σ<sup>2</sup><sub>*i*</sub>) *and, further, that the random vector*<sup>6</sup>

$$\vec{R} = (R\_1, \dots, R\_n)^t$$

*is distributed according to a one-factor Gaussian copula model, i.e., R⃗* ∼ *N*(0, Γ) *with* Γ<sub>*ii*</sub> = σ<sup>2</sup><sub>*i*</sub> *and* Γ<sub>*ij*</sub> = ρσ<sub>*i*</sub>σ<sub>*j*</sub> *with* ρ *independent of i and j for i* ≠ *j, then the* 99 % *quantile of the distribution of* ∑<sub>*i*</sub> *R<sub>i</sub> is equal to the CVA risk charge (2).*

*Proof* Using (4) and (5), we find that the aggregated CVA return (common CVA P&L) *R<sub>CVA,reg</sub>* := ∑<sup>*n*</sup><sub>*i*=1</sub> *R<sub>i</sub>* = ⟨1⃗, *R⃗*⟩<sup>7</sup> has the distribution *N*(0, σ<sup>2</sup><sub>*CVA*,*reg*</sub>) with

$$
\sigma\_{CVA,reg}^2 = \langle \vec{1}, \Gamma \vec{1} \rangle = \sum\_{i,j=1}^n \Gamma\_{ij} = \left( \sqrt{\rho} \sum\_{i=1}^n \sigma\_i \right)^2 + (1 - \rho) \sum\_{i=1}^n \sigma\_i^2 \tag{6}
$$

If we compare the above expression with (3), we see that it is equal to β<sup>2</sup> (with *B<sub>ind</sub>* = 0, i.e., no index hedges) if we set ρ = 0.25 and σ<sub>*i*</sub> = ω<sub>*i*</sub>(*M<sub>i</sub>EAD<sub>i</sub>* − *M*<sup>*hed*</sup><sub>*i*</sub>*B<sub>i</sub>*). The quantile interpretation of the CVA risk charge (i.e., formula (2)) follows from standard properties of the normal distribution.

The above lemma shows that the standardized CVA risk charge is basically the volatility of the sum ∑<sub>*i*</sub> *R<sub>i</sub>* of *n* normally distributed random variables. The normally distributed random variables are equicorrelated: ρ(*R<sub>i</sub>*, *R<sub>j</sub>*) = 0.25. Each CVA return *R<sub>i</sub>* has the volatility

$$
\sigma_i = \omega_i (M_i EAD_i - M_i^{hed} B_i). \tag{7}
$$

<sup>5</sup>This is a very strong assumption that might not be true in reality.

<sup>6</sup>By $\cdot^t$ we denote the transpose of a vector/matrix.

<sup>7</sup>By $\vec{1}$ we denote the vector $(1,\dots,1)^t$.

Hence, buying credit protection on counterparty $i$ reduces the corresponding CVA volatility. If we assume $M_i = M_i^{hed}$, the optimal hedge w.r.t. counterparty $i$ is given by a CDS whose notional amount $B_i$ equals

$$B\_i = EAD\_i. \tag{8}$$
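The mechanics of Lemma 1 can be made concrete with a short numerical sketch. All inputs below (weights $\omega_i$, maturities $M_i$, exposures $EAD_i$, hedge notionals $B_i$) are purely hypothetical placeholders; the sketch builds the per-counterparty volatilities (7), aggregates them via (6), and reads off the 99 % quantile $2.33 \cdot \sigma_{CVA,reg}$:

```python
import numpy as np

# Illustration of Lemma 1 with purely hypothetical inputs (weights omega_i,
# maturities M_i, exposures EAD_i, CDS hedge notionals B_i).
rho = 0.25
omega = np.array([0.007, 0.010, 0.012])
M = np.array([3.0, 5.0, 2.0])
M_hed = M.copy()                                  # assume M_i = M_i^hed
EAD = np.array([100.0, 80.0, 150.0])
B = np.array([0.0, 40.0, 0.0])

sigma = omega * (M * EAD - M_hed * B)             # per-counterparty vol, Eq. (7)

# Aggregated variance, Eq. (6)
var_reg = (np.sqrt(rho) * sigma.sum()) ** 2 + (1.0 - rho) * (sigma ** 2).sum()

# The same quantity via the covariance matrix Gamma of the copula model
Gamma = rho * np.outer(sigma, sigma)
np.fill_diagonal(Gamma, sigma ** 2)
assert np.isclose(var_reg, Gamma.sum())

# 99% quantile of N(0, var_reg): 2.33 is the standard normal 99% quantile
risk_charge = 2.33 * np.sqrt(var_reg)
assert risk_charge > 0
```

Setting $B_i = EAD_i$ (with $M_i = M_i^{hed}$) drives every $\sigma_i$, and hence the risk charge, to zero, in line with (8).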

# **3 Counterparty Risk from an Accounting Perspective**

As explained in the introduction, counterparty risk from an accounting perspective is quantified by a fair value adjustment called credit valuation adjustment (CVA). The CVA reduces the present value (PV) of a derivatives portfolio in order to incorporate counterparty risk:

$$PV = PV_{riskfree} - CVA,$$

whereby $PV_{riskfree}$ denotes the market value of the portfolio without counterparty risk and CVA is the adjustment to reflect counterparty risk. For the modeling of CVA, banks have some degrees of freedom. Typically, the accounting CVA is computed by means of the following formula (see e.g. [4]):

$$CVA = \int\_0^T D(t)EE(t)dP(t)\tag{9}$$

with *T* the effective maturity of the derivatives portfolio, *D*(*t*) the risk-free discount curve, *EE*(*t*) = *E*[max{0, *PV*(*t*)}] the (risk-neutral) expected positive exposure at (future time point) *t*, and *dP*(*t*) is the (risk-neutral) default probability of the counterparty in the infinitesimal interval [*t*, *t* + *dt*]. For the implementation of (9), a discretization of the integral is necessary. Many banks assume a constant *EE* profile (i.e. *EE*(*t*) = *EE*<sup>∗</sup> for all *t*). In that case, (9) simplifies to

$$
CVA = EE^* \int_0^T D(t)\,dP(t). \tag{10}
$$

Further, the (risk-neutral) default probabilities are typically modeled by a hazard rate model, i.e. one assumes that the default time is exponentially distributed with parameter λ. Using this assumption, we can write:

$$CVA = \lambda EE^\* \int\_0^T D(t)e^{-\lambda t}dt.\tag{11}$$

The approximation (11) will be helpful in the next section, where we describe the hedging of CVA from an accounting perspective.
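As a numerical illustration of (9)-(11), the following sketch assumes a flat discount rate $r$ (so $D(t) = e^{-rt}$) and a hazard rate $\lambda$, both hypothetical, and checks the closed form of (11) against a discretization of the integral in (9) with a constant exposure profile:

```python
import numpy as np

# Sketch of Eq. (11) under assumed inputs: flat risk-free rate r
# (D(t) = exp(-r t)), hazard rate lam, constant exposure EE*.
r, lam, T, EE_star = 0.02, 0.03, 5.0, 10.0

# Closed form of Eq. (11): lam * EE* * int_0^T exp(-(r + lam) t) dt
cva_closed = lam * EE_star * (1.0 - np.exp(-(r + lam) * T)) / (r + lam)

# Discretization of Eq. (9) with EE(t) = EE* and dP(t) = lam exp(-lam t) dt,
# integrated with the trapezoid rule on a fine grid
t = np.linspace(0.0, T, 20001)
integrand = np.exp(-r * t) * lam * np.exp(-lam * t)
cva_grid = EE_star * np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t))

assert np.isclose(cva_closed, cva_grid, rtol=1e-6)
```

The closed form follows because $\int_0^T e^{-(r+\lambda)t}\,dt = (1 - e^{-(r+\lambda)T})/(r+\lambda)$ under the flat-rate assumption.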

# *3.1 CVA Hedging from an Accounting Perspective*

In previous sections we have seen that regulatory CVA hedging (i.e. CVA risk charge hedging) can be achieved by buying credit protection. Effectively, (7) says that the regulatory exposure is reduced by the notional amount of the bought credit protection. We now describe CVA hedging from an accounting perspective.

Let us consider a derivatives portfolio with a single counterparty. In order to hedge the corresponding counterparty risk, one can buy, for example, a single name CDS such that the CVA w.r.t. the counterparty together with the CDS is Delta neutral (i.e. up to first order, CVA movements are neutralized by the CDS movements). The condition for Delta neutrality is

$$
\Delta CVA = \Delta CDS\tag{12}
$$

whereby Δ describes the derivative of the CVA and CDS respectively (w.r.t. the credit spread of the counterparty). To be more precise, the default leg of the CDS should compensate the CVA movements. Using a standard valuation model for a CDS (see e.g. [4]) and computing the derivatives in (12), it is easy to see that (12) is equivalent to

$$B = EE^*, \tag{13}$$

i.e. the optimal hedge amount is given by $EE^*$. Typically, $EE^*$ is given by the average of the expected positive exposures $EE(t)$ at future time points $t$:

$$EE^\* = \frac{1}{T} \int\_0^T EE(t)dt.\tag{14}$$

If we compare (13) with (8), we see that the optimal hedge notional amount for hedging CVA risk from a regulatory perspective is the regulatory exposure $EAD$, while the optimal hedge notional amount for hedging accounting CVA risk is given by $EE^*$. In general, $EAD > EE^*$ holds due to conservative assumptions made by the regulator<sup>8</sup> (we refer to [6] for a detailed comparison of these two quantities). Thus, hedging CVA risk differs depending on whether it is considered from an accounting or a regulatory perspective. This mismatch causes additional P&L volatility in the accounting framework if the CVA risk is hedged from a regulatory perspective (i.e. if the CVA risk charge is hedged).
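A small sketch of (13) and (14), using a purely hypothetical exposure profile $EE(t)$ and a stylized regulatory $EAD$ (alpha multiplier and non-decreasing effective profile, cf. footnote 8), illustrates why the accounting hedge notional $EE^*$ is smaller than the regulatory one:

```python
import numpy as np

# Sketch of Eqs. (13)-(14) with a purely hypothetical exposure profile EE(t)
# and a stylized regulatory EAD (alpha multiplier, non-decreasing profile).
T = 5.0
t = np.linspace(0.0, T, 2001)
EE = 12.0 * np.sqrt(t) * np.exp(-0.3 * t)         # hypothetical EE profile

# Eq. (14): EE* is the time average of the exposure profile (trapezoid rule)
EE_star = np.sum(0.5 * (EE[1:] + EE[:-1]) * np.diff(t)) / T

# Stylized regulatory exposure: non-decreasing effective EPE times alpha = 1.4
alpha = 1.4
EAD = alpha * np.maximum.accumulate(EE).mean()

B_accounting = EE_star                            # Eq. (13)
B_regulatory = EAD                                # Eq. (8)
assert B_regulatory > B_accounting > 0
```

The non-decreasing running maximum and the alpha multiplier are the two stylized sources of conservatism discussed in the text; with any non-trivial profile they push the regulatory notional strictly above $EE^*$.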

Finally, we remark that we can write the CVA sensitivity $\Delta_{CVA} = \frac{d}{ds}CVA$ as

$$
\Delta\_{CVA} = EE^\* \Delta\_{CDS}, \tag{15}
$$

whereby Δ*CDS* is the sensitivity of (the default leg of) a CDS with notional amount *B* = 1.

<sup>8</sup>For example, the alpha multiplier in the IMM context overstates the EAD by a factor of 1.4. Further, the non-decreasing constraint on the exposure profile leads to an overstatement, see [6] for details.

# **4 Portfolio P&L**

As explained above, the hedge instruments reduce the (regulatory) counterparty credit risk, but they may cause new market risk due to additional P&L volatility. Although, in accordance with Basel III, eligible hedge instruments are excluded from market risk RWA calculations, the additional P&L volatility of the hedge instruments leads to fluctuations in reported equity. In order to describe the effects of hedging on the overall P&L, we introduce the corresponding framework in the present section. We divide the overall P&L into different parts: the P&L of the hedge instruments, the P&L of the remaining positions, and the CVA P&L. The framework will be helpful later on, when we quantify the impact of the CVA risk charge hedges on the accounting P&L.

# *4.1 Portfolio P&L Without CVA*

Let us assume that a bank holds derivatives with $n$ different counterparties for which single name CDS exist. The bank has to decide to what extent it hedges the counterparty risk w.r.t. these counterparties by either single name CDS or index hedges. By $\Sigma$ we denote the correlation matrix (of dimension $N \times N$, $N > n$) of all risk factors $r_i$, $i = 1,\dots,N$, the bank's (trading) portfolio is exposed to. Without loss of generality, we assume that the correlations between the CDS of the considered $n$ counterparties are given by the first $n \times n$ components of $\Sigma$, i.e. $\Sigma_{i,j} = \rho(CDS_i, CDS_j)$, $i,j = 1,\dots,n$. Further, $\Sigma_{n+1,i}$ denotes the correlation between the index hedge and the CDS on counterparty $i \in \{1,\dots,n\}$. The whole portfolio $\Pi$ of the bank contains the hedge instruments (CDS and index hedge) as well as other instruments (e.g. bonds): $\Pi = \Pi_{hed} \cup \Pi_{rest}$. The sub-portfolio $\Pi_{hed}$ is driven by the credit spreads of the counterparties. Note that $\Pi_{rest}$ may depend on some of these credit spreads as well. In the following, we will assume the P&L of the portfolio $\Pi$ is given by:

$$P\&L = \sum\_{i=1}^{n} (B\_i \Delta\_i + \Delta\_{i, \text{rest}}) dr\_i + B\_{\text{ind}} \Delta\_{\text{ind}} dr\_{\text{ind}} + \sum\_{j=n+2}^{N} \Delta\_j dr\_j,\tag{16}$$

whereby $\Delta_i$ denotes the sensitivity of $CDS_i$ w.r.t. the corresponding credit spread, $\Delta_{i,rest}$ denotes the sensitivity of the remaining positions which are sensitive w.r.t. the credit spread of counterparty $i$ as well,<sup>9</sup> $B_i$ (resp. $B_{ind}$) denotes the notional of $CDS_i$ (resp. of the index hedge), and $dr_i$ describes the change of the risk factor $r_i$ (the first $n$ risk factors are the credit spreads) in the considered time period.

<sup>9</sup>For example, if $\Pi_{rest}$ contains a bond issued by counterparty $i$, then (ignoring the bond-CDS basis) $\Delta_{i,rest} = -\Delta_i$.

# *4.2 Impact with CVA*

This section extends the above considerations to the case where we allow for a CVA component. We define the total P&L as the difference between the P&L given by (16) and the CVA P&L:

$$P\&L\_{tot} = P\&L - P\&L\_{CVA},\tag{17}$$

whereby $P\&L_{CVA}$ is defined in a similar manner as in (16):<sup>10</sup>

$$P \& L\_{CVA} = \sum\_{i=1}^{n+1} \Delta\_{i,CVA} dr\_i. \tag{18}$$

In (18), the risk factors $r_i$ are the same risk factors which appear in the first $n+1$ summands of (16). This is because the CVAs are driven by the same risk factors as the corresponding hedge instruments. Recall that in a setup where counterparty risk is completely hedged, the P&L of the hedge instruments is canceled out by the P&L of the CVAs. This is the case if the corresponding sensitivities are equal. In Sect. 3.1 we have shown how one can achieve this (using the condition of Delta neutrality) by choosing the right hedge notional amounts.
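The cancellation just described can be illustrated numerically: with delta-neutral CVA hedges ($\Delta_{i,CVA} = B_i\Delta_i$, cf. Sect. 3.1) the hedge P&L in (16) is exactly offset by the CVA P&L in (18), and only the P&L of the remaining positions survives in (17). All inputs below are randomly generated placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 3, 6                                # counterparties / risk factors

Delta = rng.uniform(0.5, 1.5, n)           # CDS sensitivities per unit notional
Delta_rest = rng.uniform(-0.5, 0.5, n)     # other positions on the same spreads
Delta_other = rng.uniform(-1.0, 1.0, N - n - 1)  # remaining risk factors
B = rng.uniform(10.0, 50.0, n)             # hedge notionals
B_ind, Delta_ind = 0.0, 0.0                # no index hedge in this sketch
dr = rng.normal(0.0, 0.01, N)              # risk-factor returns

# Eq. (16): P&L of hedges plus remaining positions
pnl = np.sum((B * Delta + Delta_rest) * dr[:n]) \
      + B_ind * Delta_ind * dr[n] + np.sum(Delta_other * dr[n + 1:])

# Eq. (18) with delta-neutral CVA hedges: Delta_{i,CVA} = B_i * Delta_i
Delta_CVA = np.append(B * Delta, B_ind * Delta_ind)
pnl_cva = np.sum(Delta_CVA * dr[:n + 1])

# Eq. (17): the hedge P&L is offset by the CVA P&L; what remains is the
# P&L of the other positions only
pnl_tot = pnl - pnl_cva
residual = np.sum(Delta_rest * dr[:n]) + np.sum(Delta_other * dr[n + 1:])
assert np.isclose(pnl_tot, residual)
```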

# *4.3 Impact of CVA Risk Charge Hedging on the Accounting P&L Volatility*

The additional P&L volatility caused by the hedge instruments is basically given by the residual volatility of the hedge instruments which is not canceled by the CVAs. In order to derive an expression for this volatility, we start with the derivation of the volatility of the total portfolio P&L. The residual volatility will consist of those parts of the total volatility which are sensitive w.r.t. the hedge instruments.

In order to proceed, we have to introduce the following notation: the vector $\vec{\Delta}_{CVA} \in \mathbb{R}^{n+1}$ contains the CVA sensitivities and the return vector $\vec{dr} \in \mathbb{R}^N$ describes the changes of the $N$ risk factors the trading book is exposed to. We further introduce the sensitivity vector<sup>11</sup> $\vec{\Delta} = (\Delta_1,\dots,\Delta_{ind},\dots,\Delta_N)^t \in \mathbb{R}^N$

<sup>10</sup>We consider only the credit spreads as risk factors. Exposure movements due to changes in market risk factors are not considered. This is unproblematic for the considerations in this article since we will end up with a dynamic CVA hedging strategy (cf. Sect. 5) which incorporates the exposure changes.

<sup>11</sup>The first $n$ components of $\vec{\Delta}$ are the CDS sensitivities w.r.t. credit spread changes and the $(n+1)$-th component is the sensitivity of the index hedge.

and<sup>12</sup> $\vec{\Delta}_{rest} = (\Delta_{1,rest},\dots,\Delta_{n,rest})^t \in \mathbb{R}^n$, the notional vector $\vec{B} = (B_1,\dots,B_n,B_{ind})^t \in \mathbb{R}^{n+1}$, and the diagonal matrix $Q_\Delta = diag(\Delta_1,\dots,\Delta_n,\Delta_{ind}) \in \mathbb{R}^{(n+1)\times(n+1)}$.

**Lemma 2** *If we assume that the portfolio P&L is given by (17) and if we further assume $\vec{dr} \sim N(0,\Sigma)$ (for some correlation matrix $\Sigma$), then the squared volatility (i.e. the variance) of (17) is given by*<sup>13</sup>

$$
\begin{split}
\sigma^2_{P\&L_{tot}} &= \left\langle \begin{pmatrix} Q_\Delta \vec{B} \\ \vec{0} \end{pmatrix}, \Sigma \begin{pmatrix} Q_\Delta \vec{B} \\ \vec{0} \end{pmatrix} \right\rangle + \left\langle \begin{pmatrix} \vec{\Delta}_{CVA} \\ \vec{0} \end{pmatrix}, \Sigma \begin{pmatrix} \vec{\Delta}_{CVA} \\ \vec{0} \end{pmatrix} \right\rangle \\
&\quad + \left\langle \begin{pmatrix} \vec{\Delta}_{rest} \\ \vec{\Delta}_{N-n-1} \end{pmatrix}, \Sigma \begin{pmatrix} \vec{\Delta}_{rest} \\ \vec{\Delta}_{N-n-1} \end{pmatrix} \right\rangle - 2 \left\langle \begin{pmatrix} Q_\Delta \vec{B} \\ \vec{0} \end{pmatrix}, \Sigma \begin{pmatrix} \vec{\Delta}_{CVA} \\ \vec{0} \end{pmatrix} \right\rangle \\
&\quad + 2 \left\langle \begin{pmatrix} Q_\Delta \vec{B} \\ \vec{0} \end{pmatrix}, \Sigma \begin{pmatrix} \vec{\Delta}_{rest} \\ \vec{\Delta}_{N-n-1} \end{pmatrix} \right\rangle - 2 \left\langle \begin{pmatrix} \vec{\Delta}_{rest} \\ \vec{\Delta}_{N-n-1} \end{pmatrix}, \Sigma \begin{pmatrix} \vec{\Delta}_{CVA} \\ \vec{0} \end{pmatrix} \right\rangle.
\end{split} \tag{19}
$$

*Proof* With the above defined vectors, we can write:

$$
\begin{split}
P\&L_{tot} &= \langle Q_\Delta \vec{B} - \vec{\Delta}_{CVA}, \vec{dr}_{n+1} \rangle + \langle \vec{\Delta}_{rest}, \vec{dr}_{n+1} \rangle + \langle \vec{\Delta}_{N-n-1}, \vec{dr}_{N-n-1} \rangle \\
&= \left\langle \begin{pmatrix} Q_\Delta \vec{B} \\ \vec{0}_{N-n-1} \end{pmatrix} - \begin{pmatrix} \vec{\Delta}_{CVA} \\ \vec{0}_{N-n-1} \end{pmatrix} + \begin{pmatrix} \vec{\Delta}_{rest} \\ \vec{\Delta}_{N-n-1} \end{pmatrix}, \vec{dr} \right\rangle \\
&= \langle \vec{a} - \vec{b} + \vec{c}, \vec{dr} \rangle,
\end{split} \tag{20}
$$

whereby $\vec{dr}_{n+1}$ denotes the $(n+1)$-dimensional vector consisting of the first $n+1$ components of $\vec{dr}$, $\vec{dr}_{N-n-1}$ consists of the remaining $N-n-1$ components of $\vec{dr}$, $\vec{\Delta}_{N-n-1}$ denotes the vector of the remaining $N-n-1$ sensitivities, and $\vec{0}_{N-n-1}$ is the $(N-n-1)$-dimensional vector whose components are all equal to 0.<sup>14</sup> Clearly, the vectors $\vec{a}$, $\vec{b}$ and $\vec{c}$ coincide with the respective summands of the left argument of the scalar product in (20). If we use $\vec{dr} \sim N(0,\Sigma)$, it follows from (4) and (5):

$$
\begin{split}
\sigma^2_{P\&L_{tot}} &= \langle \vec{a} - \vec{b} + \vec{c}, \Sigma(\vec{a} - \vec{b} + \vec{c}) \rangle \\
&= \langle \vec{a}, \Sigma \vec{a} \rangle + \langle \vec{b}, \Sigma \vec{b} \rangle + \langle \vec{c}, \Sigma \vec{c} \rangle - 2\langle \vec{a}, \Sigma \vec{b} \rangle + 2\langle \vec{a}, \Sigma \vec{c} \rangle - 2\langle \vec{c}, \Sigma \vec{b} \rangle.
\end{split} \tag{21}
$$

If we plug in the expressions for $\vec{a}$, $\vec{b}$ and $\vec{c}$, we obtain (19). $\square$
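The algebraic expansion in the proof can be checked numerically. The sketch below generates a random correlation matrix $\Sigma$ and random sensitivity vectors (the zero in the $(n+1)$-th slot of $\vec{c}$ reflects the assumption, made here for illustration only, that no remaining position loads on the index factor) and verifies that the expanded form (21) matches the direct quadratic form from (20):

```python
import numpy as np

rng = np.random.default_rng(1)
n, N = 2, 5

# Random correlation matrix for the N risk factors (illustrative)
X = rng.normal(size=(N, 2 * N))
C = X @ X.T
d = np.sqrt(np.diag(C))
Sigma = C / np.outer(d, d)

Q = np.diag(rng.uniform(0.5, 1.5, n + 1))      # Q_Delta
B = rng.uniform(1.0, 10.0, n + 1)              # notionals incl. index hedge
D_cva = rng.uniform(0.5, 5.0, n + 1)           # CVA sensitivities
D_rest = rng.uniform(-1.0, 1.0, n)             # Delta_rest
D_tail = rng.uniform(-1.0, 1.0, N - n - 1)     # Delta_{N-n-1}

pad = np.zeros(N - n - 1)
a = np.concatenate([Q @ B, pad])
b = np.concatenate([D_cva, pad])
# assumed: no 'rest' position loads on the index factor, hence the 0
c = np.concatenate([D_rest, [0.0], D_tail])

direct = (a - b + c) @ Sigma @ (a - b + c)     # variance from (20)
expanded = (a @ Sigma @ a + b @ Sigma @ b + c @ Sigma @ c
            - 2 * a @ Sigma @ b + 2 * a @ Sigma @ c - 2 * c @ Sigma @ b)
assert np.isclose(direct, expanded)            # Eq. (21), and hence Eq. (19)
```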

In order to be prepared for later computations, we will further simplify Expression (19). To this end, we introduce the following notation: by $\Sigma_{n+1}$ we denote the $(n+1)\times(n+1)$ matrix consisting of the first $n+1$ column and row entries of $\Sigma$ only, i.e. $\Sigma_{i,j}$, $i,j = 1,\dots,n+1$. The matrix $\Sigma_{N,n+1}$ is the $N \times (n+1)$ matrix

<sup>12</sup>The vector $\vec{\Delta}_{rest}$ contains the $n$ sensitivities w.r.t. credit spread changes of those trading book positions which are different from the CDSs used for hedging but are sensitive w.r.t. the credit spreads of the hedge instruments as well.

<sup>13</sup>The vector $\vec{\Delta}_{N-n-1}$ is defined in the proof.

<sup>14</sup>In the following, we will omit the index $N-n-1$ and simply write $\vec{0}$.

obtained from $\Sigma$ by deleting the last $N-n-1$ columns, and $\Sigma^t_{N,n+1}$ denotes its transpose. With this notation, and using that $\vec{0}$ cancels many components in (19), we can write:

$$
\begin{split}
\sigma^2_{P\&L_{tot}} &= \langle Q_\Delta \vec{B}, \Sigma_{n+1} Q_\Delta \vec{B} \rangle + \langle \vec{\Delta}_{CVA}, \Sigma_{n+1} \vec{\Delta}_{CVA} \rangle - 2\langle Q_\Delta \vec{B}, \Sigma_{n+1} \vec{\Delta}_{CVA} \rangle \\
&\quad + 2\left\langle \vec{B}, Q_\Delta \Sigma^t_{N,n+1} \begin{pmatrix} \vec{\Delta}_{rest} \\ \vec{\Delta}_{N-n-1} \end{pmatrix} \right\rangle - 2\left\langle \vec{\Delta}_{CVA}, \Sigma^t_{N,n+1} \begin{pmatrix} \vec{\Delta}_{rest} \\ \vec{\Delta}_{N-n-1} \end{pmatrix} \right\rangle \\
&\quad + \left\langle \begin{pmatrix} \vec{\Delta}_{rest} \\ \vec{\Delta}_{N-n-1} \end{pmatrix}, \Sigma \begin{pmatrix} \vec{\Delta}_{rest} \\ \vec{\Delta}_{N-n-1} \end{pmatrix} \right\rangle.
\end{split} \tag{22}
$$

In (22), the first summand describes the volatility of the hedge instruments if they are considered as isolated from the remaining positions (i.e. those positions which are different from the hedge instruments). Analogously, the other quadratic terms (i.e. the second and the last summand in (22)) represent the volatility of the CVA and the remaining positions respectively. The cross terms (third, fourth, and fifth summand) describe the interactions between the volatility of the hedge instruments, the CVA and the remaining positions. For example, the third term describes the interaction between the CVA and the hedge instruments.

The P&L volatility $\sigma^2_{hed}$ caused by the hedge instruments is given by those terms of (22) which depend on the hedge instruments, i.e. those terms which depend on $\vec{B}$. These are the first, the third, and the fourth term of (22), i.e.

$$
\sigma^2_{hed} = \langle Q_\Delta \vec{B}, \Sigma_{n+1} Q_\Delta \vec{B} \rangle - 2\langle Q_\Delta \vec{B}, \Sigma_{n+1} \vec{\Delta}_{CVA} \rangle + 2\left\langle \vec{B}, Q_\Delta \Sigma^t_{N,n+1} \begin{pmatrix} \vec{\Delta}_{rest} \\ \vec{\Delta}_{N-n-1} \end{pmatrix} \right\rangle. \tag{23}
$$

The other terms of (22) describe the volatility caused by the remaining positions.

In order to simplify the notation, we write $\sigma^2_{hed}$ in the following way:

$$
\sigma_{hed}^2 = \langle A\vec{B}, \vec{B} \rangle + \langle \vec{B}, \vec{b} \rangle \tag{24}
$$

with

$$A = \mathcal{Q}\_{\Delta} \Sigma\_{n+1} \mathcal{Q}\_{\Delta} \tag{25}$$

and

$$
\vec{b} = 2\, Q_\Delta \Sigma^t_{N,n+1} \begin{pmatrix} \vec{\Delta}_{rest} \\ \vec{\Delta}_{N-n-1} \end{pmatrix} - 2\, Q_\Delta \Sigma_{n+1} \vec{\Delta}_{CVA}, \tag{26}
$$

where the factors 2 reproduce the cross terms of (23).

Note that $\sigma^2_{hed}$ is not simply given by a quadratic form but also incorporates a linear part. The quadratic form describes the volatility of a portfolio consisting of the hedge instruments, while the linear part describes the correlations of the hedge instruments with the remaining positions and with the CVAs.
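The reduction from (23) to the quadratic-plus-linear form (24)-(26) can likewise be verified for random placeholder inputs; here $c$ stands for the stacked vector of remaining sensitivities, and the identity holds for any such vector:

```python
import numpy as np

rng = np.random.default_rng(2)
n, N = 2, 5

# Random correlation matrix Sigma (illustrative)
X = rng.normal(size=(N, 2 * N))
C = X @ X.T
Sigma = C / np.outer(np.sqrt(np.diag(C)), np.sqrt(np.diag(C)))

Q = np.diag(rng.uniform(0.5, 1.5, n + 1))       # Q_Delta
B = rng.uniform(1.0, 10.0, n + 1)               # notionals incl. index hedge
D_cva = rng.uniform(0.5, 5.0, n + 1)            # CVA sensitivities
c = rng.uniform(-1.0, 1.0, N)                   # stacked remaining sensitivities

S_n1 = Sigma[:n + 1, :n + 1]                    # Sigma_{n+1}
S_Nn1_t = Sigma[:n + 1, :]                      # Sigma^t_{N,n+1}, (n+1) x N

# Eq. (23): the B-dependent terms of the total variance
hed_23 = (Q @ B) @ S_n1 @ (Q @ B) - 2 * (Q @ B) @ S_n1 @ D_cva \
         + 2 * B @ (Q @ S_Nn1_t @ c)

# Eqs. (24)-(26): quadratic form plus linear part
A = Q @ S_n1 @ Q
b_vec = 2 * Q @ S_Nn1_t @ c - 2 * Q @ S_n1 @ D_cva
hed_24 = (A @ B) @ B + B @ b_vec
assert np.isclose(hed_23, hed_24)
```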

#### **4.3.1 Definition of the Steering Variable**

We now define a steering variable aiming to define a unified framework for CVA risk charge hedging and P&L volatility. The steering variable is given by a synthetic volatility consisting of the sum of the regulatory CVA volatility and the volatility of the accounting P&L caused by the hedge instruments:

$$
\sigma\_{\text{syn}}^2 = \sigma\_{CVA, \text{reg}}^2 + \sigma\_{hed}^2. \tag{27}
$$

The synthetic volatility unifies both the regulatory and the accounting framework. It can be considered as a function of the hedge notional amounts. The minimum of $\sigma^2_{tot,syn}$ describes the optimal allocation between CVA risk charge reduction and P&L volatility. Note that $\sigma^2_{tot,syn}$ now contains the matrices $\Gamma$ and $\Sigma$, which describe the correlations between the same risk factors. This mismatch can be resolved if the advanced CVA risk charge is used [2]. However, the use of different CVA sensitivities cannot be resolved. The most significant differences arise due to different exposure definitions: while the exposures $EAD_i$ contained in the regulatory CVA sensitivities are based on the effective EPE and multiplied by the alpha multiplier (for IMM banks), this is not the case for the exposures used to compute the accounting CVA sensitivities. In general, these mismatches will lead to smaller accounting CVA sensitivities. Thus, a complete hedging of the CVA risk charge leads to an overhedged accounting CVA. See [6] for a complete description of the sources of the mismatch. Another source of potential overhedging is the following: if the accounting CVA is already hedged by instruments which are not eligible hedge instruments in the sense of Basel III, additional hedge instruments are necessary for the hedging of the CVA risk charge. These hedge instruments will cause additional P&L volatility, since their offsetting counterparts (i.e. the CVAs) are not present (they are already hedged).

# **5 Determination of the Optimal Hedge Strategy**

This section describes concretely how the mismatch between the regulatory regime and the accounting regime can be mitigated. The result will be a dynamic CVA hedging strategy based on an optimization principle of the steering variable introduced in the previous section. We will ignore index hedges but all results can easily be generalized to the case where index hedges are included.

As opposed to the previous sections, the vector $\vec{B}$ will not contain the component $B_{ind}$ in this section. As explained before, we want to minimize the synthetic volatility<sup>15</sup>

$$
\sigma_{syn}^2(\vec{B}) = \sigma_{hed}^2(\vec{B}) + \sigma_{CVA}^2(\vec{B}) \tag{28}
$$

<sup>15</sup>We ignore the index *tot*.

as a function of $\vec{B}$. The component $B^*_i$ of the minimum $\vec{B}^*$ describes the optimal notional amount of $CDS_i$ used to hedge the counterparty risk w.r.t. counterparty $i$. We now determine $\vec{B}^*$ by computing the zeros of the first derivative of $\sigma^2_{syn}$.

**Theorem 1** *Under the same assumptions as in Lemma 2, the minimum $\vec{B}^*$ of (28) is given by*<sup>16</sup>

$$
\vec{B}^\* = H^{-1}\vec{f} \tag{29}
$$

*with*

$$H := 2\left(A + Q_{M^{hed}} \Gamma Q_{M^{hed}}\right) \tag{30}$$

*and*

$$
\vec{f} := 2\, Q_{M^{hed}} \Gamma Q_M \overrightarrow{EAD} - \vec{b}. \tag{31}
$$

*Proof* In order to keep the display of the computations clear, we introduce the diagonal matrices $Q_M := diag(\omega_1 M_1,\dots,\omega_n M_n)$ and $Q_{M^{hed}} := diag(\omega_1 M_1^{hed},\dots,\omega_n M_n^{hed})$ and the $n$-dimensional vector $\overrightarrow{EAD}$ whose components are given by the counterparty exposures. Using these definitions, we can write:

$$
\sigma_{CVA}^2 = \langle Q_M \overrightarrow{EAD} - Q_{M^{hed}} \vec{B}, \,\Gamma(Q_M \overrightarrow{EAD} - Q_{M^{hed}} \vec{B}) \rangle, \tag{32}
$$

whereby $\Gamma$ here denotes the constant-correlation matrix of the CVAs (all diagonal elements equal to 1, all off-diagonal elements equal to $\rho$). Using (32) and (24), we can write:

$$
\begin{split}
\frac{\partial \sigma_{syn}^2}{\partial \vec{B}} &= \frac{\partial}{\partial \vec{B}}\left(\langle A\vec{B}, \vec{B}\rangle + \langle \vec{b}, \vec{B}\rangle\right) \\
&\quad + \frac{\partial}{\partial \vec{B}} \langle Q_{M^{hed}}\vec{B}, \Gamma\, Q_{M^{hed}}\vec{B}\rangle \\
&\quad - 2\frac{\partial}{\partial \vec{B}} \langle Q_{M^{hed}}\vec{B}, \Gamma Q_M \overrightarrow{EAD}\rangle \\
&= 2A\vec{B} + \vec{b} + 2Q_{M^{hed}}\Gamma Q_{M^{hed}}\vec{B} - 2Q_{M^{hed}}\Gamma Q_M\overrightarrow{EAD} \\
&= H\vec{B} - \vec{f},
\end{split} \tag{33}
$$

where we have used the notations (30) and (31). This shows (29). Further, we note that the matrix $H$ is a sum of matrices of the form $Q \Sigma Q$ with $\Sigma$ a correlation matrix and is therefore positive semi-definite; for non-degenerate correlation matrices it is positive definite and hence indeed invertible. Moreover, it holds

$$\frac{\partial^2 \sigma_{syn}^2}{\partial \vec{B}^2} = H.$$

Hence, the second derivative of $\sigma^2_{syn}$ is positive semi-definite and $\vec{B}^*$ is indeed a minimum. $\square$

<sup>16</sup>All terms are introduced in the proof.

*Remark* The implementation of the optimal hedge strategy works as follows: one computes on a regular basis (e.g. daily, weekly, etc.) the optimal solution (29). To do this one needs the CVA sensitivities,<sup>17</sup> the trading book sensitivities, and the correlation matrix of the risk factors.<sup>18</sup> Afterwards, the CVA desk buys the credit protection described by the optimal solution. This reduces the capital demand for counterparty risk and (by construction) minimizes the accounting P&L volatility of the bought credit protection.
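A sketch of Theorem 1 (restricted, as in this section, to the case without index hedges, and with randomly generated placeholder inputs): it assembles $H$ and $\vec{f}$ from (30)-(31), solves (29), and confirms by perturbation that $\vec{B}^*$ minimizes the synthetic volatility (28):

```python
import numpy as np

rng = np.random.default_rng(3)
n, N = 3, 6
rho = 0.25

# Random correlation matrix of the N risk factors (illustrative)
X = rng.normal(size=(N, 2 * N))
C = X @ X.T
Sigma = C / np.outer(np.sqrt(np.diag(C)), np.sqrt(np.diag(C)))

Delta = rng.uniform(0.5, 1.5, n)                 # CDS sensitivities
Q_D = np.diag(Delta)
omega = rng.uniform(0.005, 0.02, n)              # regulatory weights
M = rng.uniform(1.0, 5.0, n)
M_hed = M.copy()
EAD = rng.uniform(50.0, 150.0, n)
D_cva = rng.uniform(0.5, 5.0, n)                 # accounting CVA sensitivities
c = rng.uniform(-1.0, 1.0, N)                    # remaining sensitivities

Gamma = np.full((n, n), rho)
np.fill_diagonal(Gamma, 1.0)                     # CVA equicorrelation matrix
Q_M = np.diag(omega * M)
Q_Mh = np.diag(omega * M_hed)

S_n = Sigma[:n, :n]
A = Q_D @ S_n @ Q_D                              # Eq. (25)
b = 2 * Q_D @ Sigma[:n, :] @ c - 2 * Q_D @ S_n @ D_cva   # Eq. (26)

def sigma_syn2(B):
    hed = (A @ B) @ B + B @ b                    # Eq. (24)
    res = Q_M @ EAD - Q_Mh @ B
    return hed + res @ Gamma @ res               # plus Eq. (32)

H = 2 * (A + Q_Mh @ Gamma @ Q_Mh)                # Eq. (30)
f = 2 * Q_Mh @ Gamma @ Q_M @ EAD - b             # Eq. (31)
B_star = np.linalg.solve(H, f)                   # Eq. (29)

# B* is a minimum: random perturbations never decrease sigma_syn^2
for _ in range(5):
    v = rng.normal(0.0, 1.0, n)
    assert sigma_syn2(B_star + v) >= sigma_syn2(B_star)
```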

The approach presented in this article is based on many simplifying assumptions and is restricted to the standardized CVA risk charge. Obviously, one could relax these assumptions and apply a comparable optimization principle. In such a case, it would possibly be hard to derive an analytical solution; instead, one would obtain a numerical solution.

# *5.1 Special Cases*

For illustration purposes, we consider the case $n = 1$, i.e. the special case of a single netting set. In that case both $H$ and $\vec{f}$ are scalars:

$$H = 2\Delta^2 \Sigma_{1,1} + 2\omega^2 (M^{hed})^2$$

and

$$f = 2\omega^2 M M^{hed} EAD + 2\Delta\Delta_{CVA}\Sigma_{1,1} - 2\left(\Delta\Sigma_{1,1}\Delta_{rest} + \Delta\sum_{j=2}^{N} \Sigma_{1,j}\Delta_j\right), \tag{34}$$

whereby $\Delta$ describes the sensitivity of the hedge instrument of the considered counterparty, $\Delta_{rest}$ the sensitivity of the remaining positions (i.e. all positions without the CDS used for hedging purposes), $\Delta_{CVA}$ the sensitivity of the accounting CVA, and $\Delta_j$ the sensitivities of the remaining positions to the other risk factors. Thus, the optimal solution is

$$B^* = \frac{2\omega^2 M M^{hed} EAD + 2\sigma^2 \Delta \Delta_{CVA} - 2\left(\Delta \sigma^2 \Delta_{rest} + \Delta \sum_{j=2}^N \Sigma_{1,j} \Delta_j\right)}{2\Delta^2 \sigma^2 + 2\omega^2 (M^{hed})^2}, \tag{35}$$

where we have used that $\Sigma_{1,1}$ is equal to the variance $\sigma^2$ of the hedge instrument. First, in order to get a better understanding of $B^*$, let us assume that the risk factor (credit spread) of the hedge instrument is independent of the remaining positions, i.e.

<sup>17</sup>Banks which actively manage their CVA risk usually compute these sensitivities.

<sup>18</sup>Larger banks usually have these data available, e.g. for market risk management purposes.

$\Delta_{rest} = 0$ and $\Sigma_{1,j} = 0$ for $j = 2,\dots,N$. In that case (35) becomes (assuming additionally $M = M^{hed}$)

$$B^\* = \frac{2\omega^2 M^2 EAD + 2\Delta\Delta\_{CVA}\sigma^2}{2\omega^2 M^2 + 2\Delta^2\sigma^2}.\tag{36}$$

We see that $B^*$ is (at least beyond a certain volatility level) a decreasing function of $\sigma^2$, as we would expect. Obviously, if we ignore the fact that the hedge instrument introduces further volatility (i.e. if we assume $\sigma^2 = 0$), it holds

$$B^\* = EAD.$$

It is easy to see that this is the optimal hedge amount if we minimize the CVA risk charge alone. As explained above, the most significant difference between the IFRS CVA and the regulatory CVA lies in the different exposure computation methodologies. In (36), these differences are reflected in $EAD$ and $\Delta_{CVA}$: while $EAD$ is based on the regulatory methodology, $\Delta_{CVA}$ is based on the accounting CVA methodology.<sup>19</sup> For illustration purposes, let us assume that $\Delta_{CVA}$ is based on the same exposure methodology as the regulatory CVA sensitivities (and that the modeling assumptions of Sect. 3 hold). This means that (cf. (15))

$$
\Delta\_{CVA} = EAD\Delta,\tag{37}
$$

i.e. we use the regulatory exposure $EAD$ in (15) instead of the economical exposure $EE^*$. If we plug (37) into (36), we obtain:

$$B^\* = \frac{(2\omega^2 M^2 + 2\Delta^2 \sigma^2)EAD}{2\omega^2 M^2 + 2\Delta^2 \sigma^2} = EAD. \tag{38}$$

Thus, if we ignore the mismatch between the accounting and the regulatory CVA, the optimal hedge solution is given by the optimal hedge solution of the CVA risk charge only. If we include the mismatch, we can approximate the accounting CVA sensitivity by (cf. (15))

$$
\Delta\_{CVA} = EE^\* \Delta. \tag{39}
$$

As explained in Sect. 4.3.1, *EE*<sup>∗</sup> is smaller than *EAD*. Using (36) and (39) yields:

$$B^* = \frac{2\omega^2 M^2 EAD + 2\Delta^2\sigma^2 EE^*}{2\omega^2 M^2 + 2\Delta^2\sigma^2} < \frac{2\omega^2 M^2 EAD + 2\Delta^2\sigma^2 EAD}{2\omega^2 M^2 + 2\Delta^2\sigma^2} = EAD. \tag{40}$$

Hence, the mismatch leads to a smaller optimal hedge amount than the current regulatory exposure.
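The effect described in (36)-(40) can be made concrete with a scalar sketch (all parameter values hypothetical): without the accounting mismatch $B^* = EAD$, while with $\Delta_{CVA} = EE^* \Delta$ the optimal notional lies strictly between $EE^*$ and $EAD$ and decreases in $\sigma^2$:

```python
import numpy as np

# Scalar illustration of Eqs. (36)-(40) with assumed inputs.
omega, M, Delta = 0.01, 3.0, 1.0
EAD, EE_star = 140.0, 100.0              # EAD > EE* (regulatory conservatism)

def B_opt(sigma2, delta_cva):
    # Eq. (36), with Delta_CVA passed explicitly
    return (2 * omega**2 * M**2 * EAD + 2 * Delta * delta_cva * sigma2) \
           / (2 * omega**2 * M**2 + 2 * Delta**2 * sigma2)

# No mismatch (Delta_CVA = EAD * Delta, Eq. (37)): B* = EAD for any sigma2
assert np.isclose(B_opt(0.5, EAD * Delta), EAD)

# With the mismatch (Delta_CVA = EE* * Delta, Eq. (39)): EE* < B* < EAD
for s2 in (0.01, 0.1, 1.0):
    assert EE_star < B_opt(s2, EE_star * Delta) < EAD

# and B* decreases towards EE* as the hedge-instrument variance grows
vals = [B_opt(s2, EE_star * Delta) for s2 in (0.01, 0.1, 1.0, 10.0)]
assert all(x > y for x, y in zip(vals, vals[1:]))
```

Algebraically, (36) is a weighted average of $EAD$ and $EE^*$ with weights $2\omega^2 M^2$ and $2\Delta^2\sigma^2$, which explains both the bounds and the monotonicity.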

<sup>19</sup>Note that $\Delta_{CVA}$ depends on the exposure as well (while $\Delta$ is based on a unit exposure, cf. (16)). But this exposure is computed based on the accounting methodology. This is the main source of differences between the accounting and regulatory regimes.

We remark that it cannot be excluded that $B^*$ becomes negative. This is the case if the risk factors of the remaining positions are strongly correlated with the risk factor of the hedge instrument. In such a situation it seems reasonable to set $B^* = 0$.

**Acknowledgements** The KPMG Center of Excellence in Risk Management is acknowledged for organizing the conference "Challenges in Derivatives Markets - Fixed Income Modeling, Valuation Adjustments, Risk Management, and Regulation".

**Open Access** This chapter is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, a link is provided to the Creative Commons license and any changes made are indicated.

The images or other third party material in this chapter are included in the work's Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work's Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.

# **References**


# **Capital Optimization Through an Innovative CVA Hedge**

**Michael Hünseler and Dirk Schubert**

**Abstract** One of the lessons of the recent financial crisis was the inherent credit risk attached to the value of derivatives. Since not all derivatives can be cleared by central counterparties, a significant amount of OTC derivatives will be subject to increased regulatory capital charges. These charges cover both current and future unexpected losses; the capital costs for derivatives transactions can become substantial, if not prohibitive. At the same time, capital optimization through CDS hedging of counterparty risks will result in a hedge position beyond the economic risk ("overhedging") in order to meet Basel II/III rules. In addition, IFRS accounting rules differ from Basel, creating a mismatch when hedging CVA. Even worse, CVA hedging using CDS may introduce significant profit and loss volatility while satisfying the conditions for capital relief. An innovative approach to hedging CVA aims to solve these issues.

**Keywords** CVA · Hedging · CDS · Contingent financial guarantee · Risk charges · OTC derivatives

# **1 Preface**

In the following, the nexus between credit risk (counterparty risk), liquidity, and market risk is analyzed, and a solution with respect to CVA hedging of OTC derivative contracts is proposed.

The starting point is the consideration of collateral and its respective recognition in different but "basic" financial instruments like repos and (partially un-) collateralized

M. Hünseler (B)

Credit Portfolio Management, Assenagon Asset Management S.A., Zweigniederlassung München, Prannerstraße 8, 80333 Munich, Germany e-mail: michael.huenseler@assenagon.com

D. Schubert

Financial Services KPMG AG Wirtschaftsprüfungsgesellschaft, The Squaire, Am Flughafen, 60549 Frankfurt am Main, Germany e-mail: dschubert@kpmg.com

OTC derivative contracts as well as the comparison to corresponding uncollateralized financial instruments like money market loans or uncollateralized OTC derivative contracts. The role of collateral is analyzed with respect to its legal basis, its treatment in Financial Accounting (IFRS, refer to [4]) and regulatory reporting according to Basel II/III (cf. [1, 2]).

The analysis leads to a definition of the concept of liquidity and its relation to the use of collateral in financial markets. As will be shown, the concept of liquidity, inherent in the legal framework related to collateral of basic financial instruments, can be considered as a transformation of secured into unsecured financing and vice versa. Moreover, with respect to the associated valuation and risk, the liquidity transformation exhibits similarities to the concept of wrong-way risk. The transformation of unsecured into secured financing can be used to derive new types of financial instruments, e.g. in the application to CVA hedging issues of OTC derivative contracts. In this case the hedging instrument also solves the issue of disentangling funding value adjustments (FVA) and credit value adjustments (CVA), which is intensively discussed by practitioners in the context of the pricing of OTC derivatives.

# **2 The Role of Collateral in OTC Contracts and Its Legal Basis**

In the following, the main legal basis for the role of collateral is outlined.

# *2.1 The Role of Legal Versus Economic Ownership*

Two main properties are of relevance in connection with the role of collateral: the transfer of legal ownership (i.e. the possibility of "re-hypothecation") as opposed to mere economic ownership, and the value of the collateral.

By entering into a repurchase agreement the legal title to the securities is transferred to the counterparty but economically the securities stay with the selling counterparty since the buying counterparty has the obligation to compensate the selling counterparty for income (manufactured payments) associated with the securities and to redeliver the securities. In case of an Event of Default, both obligations terminate. The treatment in an Event of Default provides that the residual claim is settled in cash and determined taking into account the cash side as well as the value of the collateral. In this case the obligation to redeliver securities transferred as collateral expires and the buying counterparty remains the legal owner. Thus the price risk of the collateral (uncertainty of value) is entirely borne by the legal owner.

In the case of (only) economic ownership, e.g. a pledge, this is not necessarily so, since the treatment in an Event of Default differs: this kind of "collateral" is part of the bankrupt's legal estate and is therefore subject to the insolvency procedure. In line with these legal differences, the regulatory rules according to Basel II/III and the accounting rules under IFRS also treat collateral differently. In general, IFRS follows the economic ownership concept irrespective of the legal basis of the collateral, while Basel II/III rather follows the legal ownership concept.

# *2.2 Affected Market Participants*

Not all market participants are affected by the same accounting and regulatory rules. Banks have to follow IFRS and Basel II/III rules, while, e.g., investment funds are not affected by Basel II/III but are governed by investment fund legislation such as the UCITS directive. These different legal frameworks impact the usage of collateral in OTC contracts: e.g. the assets of an investment fund under UCITS represent special assets, and the use of repos and cash collateral is limited. In addition, such investment funds have no access to sources of liquidity other than the capital paid in, which limits the use of cash and the provision of cash collateral in the context of derivatives exposure. For example, cash collateral received from OTC derivative contracts has to be kept in segregated accounts and cannot be used for any kind of (reverse) repo transaction. Alternatively, the use of a custodian for optimizing the provision of cash collateral can be considered.

# *2.3 Financial Instruments Involving Collateral and Standard Legal Frameworks (Master Agreements)*

Analyzing the legal basis of collateral facilitates the definition of liquidity and liquidity transformation.

# **2.3.1 Derivatives Under ISDA Master Agreement**

The type and use of collateral are governed by the CSA (credit support annex), which represents an integral part of the ISDA Master Agreement framework<sup>1</sup> and cannot be considered separately. The ISDA Master Agreement forms the legal framework applicable to the individual derivative contracts and is supplemented by the CSA. For example, default netting in the Event of Default (default of a counterparty) is governed by the ISDA Master Agreement, including the netting of the collateral, which in turn is defined in the CSA. The CSA defines the type(s) of collateral and the terms of margining/posting, while the transfer of legal ownership is governed by the ISDA Master Agreement. In general, ISDA Master Agreements contracted under English Law provide for the legal transfer of ownership of the collateral, while ISDA Master Agreements contracted under New York Law do not. In the latter case, re-hypothecation, i.e. the re-use of the received collateral with other counterparties, is prohibited.

<sup>1</sup>ISDA®, International Swaps and Derivatives Association, Inc., 2002 Master Agreement.

In case of ISDA Master Agreements under English Law the derivative contracts are terminated in case of an Event of Default and the collateral is taken into account in order to determine the residual claim. The determination of the residual claim is performed independently from the estate of the insolvent party.

#### **2.3.2 Repos Under GMRA**

A repo or repurchase agreement under the GMRA<sup>2</sup> can economically be seen as a collateralized loan and is typically motivated by the request for cash. In a repurchase agreement, the legal title to the securities provided as collateral is transferred to the counterparty (buyer) in exchange for the desired cash (purchase price). The credit risk and liquidity of the underlying securities determine the haircut in the valuation of the collateral. Adverse changes in the inherent credit risk of the securities are offset by an increase in the haircut and, in terms of margining, induce additional posting of collateral to the counterparty. At maturity the securities are legally transferred back to the original owner (seller) in exchange for the agreed cash amount (repurchase price). In case of the counterparty's default the securities are not returned, and the recovery risk of the securities is borne by their legal owner (the buyer).
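The haircut and margining mechanics just described can be sketched numerically (a simplified illustration with hypothetical figures; `collateral_value` and `margin_call` are made-up helper names, not terms of any master agreement):

```python
# Simplified sketch of repo margining with a haircut (hypothetical figures):
# the haircut-adjusted value of the securities must cover the cash loan;
# a drop in value (or a haircut increase) triggers a margin call.

def collateral_value(market_value: float, haircut: float) -> float:
    """Value the cash provider attributes to the posted securities."""
    return market_value * (1.0 - haircut)

def margin_call(cash_loan: float, market_value: float, haircut: float) -> float:
    """Additional securities (in market-value terms) the seller must post;
    zero if the haircut-adjusted collateral already covers the loan."""
    shortfall = cash_loan - collateral_value(market_value, haircut)
    return max(shortfall / (1.0 - haircut), 0.0)

# 100 of cash lent against securities worth 110 with a 5% haircut: covered.
print(margin_call(100.0, 110.0, 0.05))   # 0.0
# The same position after the haircut is raised to 15%: a call is triggered.
print(margin_call(100.0, 110.0, 0.15))
```

The second call illustrates the sentence above: a deterioration of the securities is absorbed by a larger haircut, which in turn forces the seller to post more collateral.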

#### **2.3.3 Securities Lending Under GSLMA**

In contrast to a repo, securities lending under the GSLMA<sup>3</sup> is motivated by the need for securities, but it is (commonly) also a secured financing transaction, since the securities as well as the collateral are legally transferred to the respective counterparty. In the secured case the collateral can be cash or other securities.

# *2.4 Credit and Counterparty Risk Related to Collateral*

Consider the case that Bank 1 and Bank 2 enter into a repo transaction, where Bank 2 receives cash from Bank 1 in return for securities. There are two features of importance. First, Bank 1 needs cash funding, which requires an assumption with respect to the sources of funding, e.g. central bank or deposits; this assumption represents a component in determining the profitability of the repo. The second feature is the inherent wrong-way risk within the repo transaction. In this case the wrong-way risk for Bank 1 is defined as an adverse correlation (positive in the example above) between the counterparty credit risk toward Bank 2 and the market value of the collateral (securities). Assuming a long position in the underlying securities (collateral) for Bank 1, the wrong-way risk consists of a decrease in the value of the securities (collateral) and a simultaneous decrease in the creditworthiness of Bank 2. In this case the risk for Bank 1 is the failure of Bank 2 to balance the collateral posting. Since in a repo transaction the legal ownership is transferred to Bank 1, the net risk position comprises the price risk (in the Event of Default of Bank 2) associated with the collateral (securities), including the haircut, and the cash claim (cash loan). A similar rationale holds in the case of a short position in the securities (collateral), since an event of default affects the ability to post as well as to return posted collateral. Similar considerations hold in the case of a (partially) collateralized OTC derivative transaction, e.g. an interest rate swap.

<sup>2</sup>SIFMA, Securities Industry and Financial Markets Association, and ICMA, International Capital Market Association, Global Master Repurchase Agreement, 2011 version.

<sup>3</sup>ISLA, International Securities Lending Association, Global Master Securities Lending Agreement, version January 2010.
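The adverse correlation that defines wrong-way risk here can be made concrete with a small one-factor simulation (an illustrative toy model with assumed parameters, not a calibrated risk engine): when the same factor drives both Bank 2's creditworthiness and the collateral value, losses conditional on default are systematically higher.

```python
# Toy one-factor model (assumed parameters): Bank 1 has lent 100 in cash;
# Bank 2's creditworthiness and the collateral value share a common factor z.
import random

def expected_loss(rho: float, n: int = 100_000, seed: int = 1) -> float:
    rng = random.Random(seed)
    threshold = -1.2816          # ~10% default probability (standard normal)
    total = 0.0
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)                          # systematic factor
        credit = rho * z + (1.0 - rho**2) ** 0.5 * rng.gauss(0.0, 1.0)
        collateral = 100.0 * (1.0 + 0.10 * z)            # tracks the factor
        if credit < threshold:                           # Bank 2 defaults
            total += max(100.0 - collateral, 0.0)        # uncovered claim
    return total / n

# Wrong-way case (rho > 0): defaults coincide with low collateral values,
# so the average loss is visibly larger than in the right-way case.
print(expected_loss(0.8), expected_loss(-0.8))
```

The comparison of the two runs is the whole point: identical marginal default probability and identical collateral distribution, yet the dependence structure alone changes the loss.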

# **3 Terms of Liquidity and Definition of Liquidity Transformation**

Dealing with the concept of liquidity reveals that the term is not defined consistently across financial regulations. A natural approach is to adopt legal definitions.

# *3.1 Terms of Liquidity*

There is a variety of definitions for the term liquidity, e.g. meeting payment obligations (liquidity of an entity), liquid marketable securities (the ability to buy and sell financial instruments), etc. The analysis above reveals the interdependence of "liquidity" and counterparty credit risk, respectively credit risk. As such, the liquidity of an entity can be considered as the (relatively measured) ability of a bank to raise cash from a credit line or in return for collateral, which in turn depends on the liquidity of the financial instruments involved. The collateral itself is only accepted if its price can be reliably determined, e.g. if it is traded with sufficient frequency on an active market.

# *3.2 Comparison of Secured and Unsecured Financing*

The concepts of liquidity and of liquidity transformation are best illustrated by comparing unsecured and secured financing in the case of a default event. Continuing the example above, the following comparison considers Bank 1 as the cash provider.

The comparison is made along the following dimensions:

1. Financial action
2. Prerequisite and term of liquidity
3. Net (relative) risk position in case of default
4. Relation to the estate of the insolvent party
5. Risk

Note that in the comparison above the net (relative) risk position in both cases, secured and unsecured financing, involves a recovery rate, but the associated risk relates to different counterparties. In the case of secured financing the default risk is coupled with the recovery risk (price risk) of the collateral, and the risk position can be settled promptly in case of a default, while in the case of unsecured financing the settlement of the recovery depends on the insolvency process.

In particular, this comparison shows that the credit risk toward the counterparty in the unsecured financing transaction, which is rather illiquid, stands against, in the secured case, the market value risk of the received collateral (assumed to be liquid) together with the correlation between this risk and the credit risk of the issuer of the securities taken as collateral. In the adverse case this correlation is also known as "wrong-way risk".

# *3.3 Liquidity Transformation*

Accordingly, it is not useful to consider liquidity as an absolute quantity, but rather as a relative one: a relation between secured and unsecured financing, which we term liquidity transformation. This transformation is not independent of credit and counterparty risk, since each type of financing is associated with a different type of credit risk. The liquidity transformation depends on the type of entity and cannot be considered separately from its legal status. A bank has broader access to liquidity and a higher degree of freedom to assign it, irrespective of purpose, than, e.g., an investment fund.

# **4 New Approach to CVA Hedging**

The new CVA hedging approach outlined below represents a response to current challenges in banking regulation and reveals the importance of liquidity transformation. The legal background described above can be used to explain current challenges of the banking industry if, in addition to prevailing market conditions, the regulatory and financial accounting environments are taken into account. Recent environmental changes have an immediate impact on banking business activities concerning counterparty risk and can be summarized as follows:

Regulatory and Accounting Aspects


Business Impact


# *4.1 Issue*

During the financial crisis, regulators and accounting standard setters recognized the relevance of counterparty credit risk in OTC derivative contracts. In response, several regulatory (legislative) initiatives have been undertaken, such as central clearing, increased regulatory capital, etc. These impacted the banking industry in several ways: intensified use of credit risk mitigation techniques and increased demand for secured transactions (demand for collateral, cf. also [3]).

Despite the environmental changes, credit risk mitigation is and remains essential for continuing banking business. Considering equity as a scarce resource, banks are forced to tighten their credit exposure in order to offset the increase in capital charges due to increased costs for CCR and other factors. The tightening of credit exposure limits banking business and increases the demand for credit risk mitigation techniques (including hedging).

The mentioned regulatory changes induce tremendous costs for the banking industry. Managing credit risk by the commonly used CDS hedging strategies therefore becomes expensive under current banking regulation, so credit risk management will be rearranged, e.g. by holding more offsetting positions, avoiding exposures (reducing limits), or transferring ("outsourcing") risk outside the regulated banking sector; investment funds, for example, are in a favorable position to manage a bank's risks. This also holds for counterparty credit risk, following the idea of transferring counterparty credit risk to market participants outside the banking sector that are in a position to manage this risk economically at lower cost than banks.

Additionally, the banking industry is faced with various different regulations. With respect to counterparty credit risk, a bank is confronted with conflicting objectives resulting from regulatory requirements, i.e. Basel II/III, and financial accounting rules. Therefore, under current regulatory and accounting requirements, banks cannot manage the counterparty credit risk (CCR) of derivatives uniformly with respect to capital requirements and P/L volatility. This results from the fact that hedging counterparty credit risk exposure in terms of the Basel II/III requirements means hedging current and future changes of exposure, while IFRS only considers current exposure. So a bank is required to hedge more than the current exposure ("overhedging") in terms of Basel II/III. But since hedging is mainly carried out with derivatives such as CDS, these CDS cause P/L volatility under IFRS, because derivatives are recognized at fair value through P/L.
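The gap between the two exposure notions can be illustrated with a stylized one-step simulation (an illustrative sketch under assumed normal dynamics, not a regulatory model): for an at-the-money trade the current, IFRS-relevant exposure is zero, while the Basel-relevant expected positive exposure over a future horizon is strictly positive.

```python
# Stylized one-step exposure model (assumed dynamics and parameters):
# current exposure is max(V_today, 0); the Basel view also cares about
# E[max(V_future, 0)], the expected positive exposure (EPE).
import random

def expected_positive_exposure(v0: float, vol: float, horizon: float,
                               n: int = 100_000, seed: int = 7) -> float:
    """Monte Carlo estimate of E[max(V(horizon), 0)] for a normally
    distributed change in market value."""
    rng = random.Random(seed)
    scale = vol * horizon ** 0.5
    return sum(max(v0 + scale * rng.gauss(0.0, 1.0), 0.0) for _ in range(n)) / n

current_exposure = max(0.0, 0.0)   # at-the-money trade: zero exposure today
epe = expected_positive_exposure(v0=0.0, vol=10.0, horizon=1.0)
print(current_exposure, epe)       # the EPE-based hedge amount exceeds zero
```

Hedging the (positive) EPE while IFRS recognizes only the zero current exposure is exactly the "overhedging" mismatch described above.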

As described above, secured and unsecured financing are common practice in the finance industry and can be observed in the counterparty credit risk of OTC derivative contracts. As illustrated below, in an uncollateralized OTC derivative trade between Bank A and counterparty B, the parties enter into an unsecured financing relationship. If the market value of the derivative trades of Bank A against counterparty B increases, then Bank A is exposed to counterparty credit risk (CVA risk). Bank A implicitly provides counterparty B with an illiquid credit line in the sense that the positive exposure amount ("market value") is recognized as an asset which becomes a legal claim in the Event of Default. This exposure is not a tradable asset but needs to be funded; it can thus be interpreted as an illiquid asset. In comparison to standard banking credit business, this credit line is unlimited and varies with the market value of the underlying derivative trades, which also implies unlimited funding. The current focus of discussion and research concentrates on measuring counterparty credit risk by exposure and default probability modeling (CVA risk) and on the assignment of the appropriate discount rate for the OTC derivative trades reflecting the FVA. The discussed approaches share the following assumptions:


These ideal assumptions are not necessarily met in reality; therefore, alternative approaches have to be explored.
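The "exposure and default probability modeling" mentioned above is usually condensed into the standard discretized unilateral CVA formula, sketched below (the time grid, exposure profile, survival curve, and recovery rate are hypothetical illustration values, not data from the text):

```python
# Standard discretized unilateral CVA (hypothetical inputs):
#   CVA ~ (1 - R) * sum_i D(t_i) * EE(t_i) * [S(t_{i-1}) - S(t_i)]
# where EE is the expected exposure, S the survival curve, D the discount
# factor, and R the recovery rate.

def cva(expected_exposure, survival, discount, recovery=0.4):
    total = 0.0
    for i in range(1, len(survival)):
        default_prob = survival[i - 1] - survival[i]  # PD in (t_{i-1}, t_i]
        total += discount[i] * expected_exposure[i] * default_prob
    return (1.0 - recovery) * total

ee = [0.0, 4.0, 5.0, 3.0]        # expected exposure at the grid dates
surv = [1.0, 0.98, 0.96, 0.95]   # survival probabilities
disc = [1.0, 0.99, 0.97, 0.95]   # risk-free discount factors
print(round(cva(ee, surv, disc), 5))
```

Each summand is the discounted expected loss from a default in one grid interval; the hedging instrument proposed below attacks the same exposure without relying on the ideal assumptions listed above.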

# *4.2 Solution*

Since banks with significant activities in derivatives markets can be affected quite heavily by the aforementioned issues, a workable solution should resolve the built-in conflict between regulatory and accounting requirements. Such a solution contributes to an improved competitiveness of the bank in the context of derivative risk management and derivatives pricing, and supports the bank in conducting derivatives business, which ultimately benefits the economy as a whole. Consequently, a potential solution consists in developing a financial instrument ("credit risk mitigating instrument") which reduces the Basel II/III CCR capital requirements and the CVA risk charge without resulting in additional P/L volatility under IFRS. Such a financial instrument represents a solution to the issues described above since it creates:


The outline of a solution follows the liquidity transformation. The unsecured financing for OTC derivatives would be represented by uncollateralized OTC derivatives, while secured financing requires a corresponding posting of collateral. Pursuing the aim of decoupling liquidity and counterparty risk, at least three parties need to be involved, as demonstrated in the analysis of repos above. The aim can therefore not be achieved by the cash-collateralized bilateral OTC derivatives commonly used in the interbank market, since there is still a one-to-one correspondence between liquidity requirements (e.g. cash collateral postings) and counterparty risk. Additionally, a bilateral CSA assumes that both counterparties have unlimited access to liquidity, which represents a difficulty if counterparty B is a corporate, given its limited access to collateral/cash. A secured financing transaction for CVA hedging therefore has to be structured differently.

**Fig. 1** Secured OTC derivative transaction

The secured financing transaction outlined in Fig. 1 involves a third party, the "Default Risk Taker" C, who posts collateral to Bank A on behalf of counterparty B, i.e. whenever the value of the derivative trade is positive for Bank A. This transaction represents a tri-party CSA and works similarly to margining. The transaction between the "Default Risk Taker" C and Bank A is an asymmetric contract: if the value of the derivative trade is negative for Bank A, no collateral is provided to or by Bank A. In case of a default of counterparty B the posted collateral is not returned to the "Default Risk Taker" C. The structure described above represents the appropriate complement to a bilateral uncollateralized OTC derivative transaction.
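The asymmetry of the tri-party arrangement can be stated compactly (a schematic sketch; the function names are illustrative, not contract terms): C's posting replicates only the positive part of the trade value, whereas a bilateral CSA would also require Bank A to post when the value turns negative.

```python
# Schematic posting rules (illustrative): v is the trade value from
# Bank A's perspective.

def triparty_posting(v: float) -> float:
    """Collateral the Default Risk Taker C posts to Bank A: max(v, 0)."""
    return max(v, 0.0)

def bilateral_posting(v: float) -> tuple:
    """Contrast: symmetric bilateral CSA, returning the pair
    (posted to Bank A, posted by Bank A)."""
    return max(v, 0.0), max(-v, 0.0)

print(triparty_posting(10.0), triparty_posting(-10.0))   # 10.0 0.0
print(bilateral_posting(-10.0))                          # (0.0, 10.0)
```

The second print shows the case the tri-party structure deliberately avoids: Bank A never has to fund outgoing collateral, which is what decouples its liquidity need from the counterparty risk transfer.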

The structure reveals the concept of liquidity transformation, including a decoupling of liquidity and counterparty risk, since by using the contract the unsecured financing transaction is transformed into a secured financing transaction. Referring to the comparison of unsecured and secured financing described above (cf. Sect. 3.2), the proposed structure goes one step further by linking both market segments and transforming liquidity within one single transaction. By definition of the liquidity transformation, the transaction exchanges different types of credit risk.

# *4.3 Application*

The table in Fig. 2 compares the new CVA hedge structure (cash collateral with contingent financial guarantee, "CCCFG"; for more detail refer to [5]) to existing credit risk mitigation techniques applied in the banking industry. Its main features are summarized as follows:

• The proposed structure represents a credit risk mitigating instrument which reduces the Basel II/III CCR capital requirements and the CVA risk charge, since cash collateral provided by a third party is permitted under the Basel II/III requirements and reduces the exposure according to Basel II/III.


**Fig. 2** Current and new approaches for credit risk mitigation in banking industry


# *4.4 Example*

In the following, for the sake of simplicity, only a qualitative example is provided, since a comparison of the induced costs already indicates the profitability of the CVA hedge.


With respect to the risk illustrated in the first line of the table above, the CVA hedge transaction entirely mitigates the risk of Bank A by transferring it to Investment Fund C. This results from the cash collateral posted by Investment Fund C to Bank A on behalf of counterparty B. Comparing the induced costs (second line of the table above) reveals that the (uncollateralized) derivative business is exposed to regulatory and cost-of-equity charges as well as funding costs. In case of the CVA hedge transaction, none of these costs apply, since the cash collateral posted by Investment Fund C to Bank A on behalf of counterparty B leads to full regulatory capital and cost-of-capital relief and serves as funding for the derivative exposure between Bank A and counterparty B. On the other hand, Bank A pays a fee to Investment Fund C for taking over the counterparty credit risk of B, and also interest

<sup>4</sup>In order to keep legal and operational complexity in an event of default low one netting set is considered.


**Fig. 3** Comparison derivatives exposure with and without CVA Hedge transaction from bank A's perspective

on the posted cash collateral. To describe the associated cash flow profiles, the two situations, default and non-default of the counterparty, are distinguished (third line of the table above). While without the CVA hedge structure the cash flow profiles are straightforward, with the CVA hedge transaction fee and interest payments on the collateral additionally have to be considered in the non-default situation. In the event of default of counterparty B, the residual claim of the transaction is physically delivered to Investment Fund C in return for cash equal to the notional of the residual claim. This procedure follows standard ISDA rules (Fig. 3).

# **5 Conclusion**

The new CVA hedging instrument is used to transfer counterparty credit risk to entities which are able to manage the risk on an economic basis at lower cost. Investment funds can act as "credit risk takers" and manage counterparty credit exposure at a lower cost than banks, since investment funds are not subject to regulatory capital requirements according to Basel II/III. It has to be noted, though, that an implementation of the solution described above requires substantial capability and expertise in dealing with derivatives at the risk-taking investment funds. On the other hand, since investment funds are not subject to the same regulations as the ones described above for banks, they may become a natural partner for banks in this context.

The proposed structure bridges the difference between capital rules and financial accounting standards in order to optimize capital requirements and charges for CVA. This is achieved by its liquidity transformation property, the liquidity and credit risk transformation of the counterparty's exposure, and by meeting the Basel II/III and IFRS requirements: simultaneous CCR capital and CVA risk charge relief as well as reduced P/L volatility in IFRS resulting from CVA accounting. While the objective outlined herein is predominantly to provide a suitable solution for CVA issues in the context of derivatives transactions, it may also create interesting opportunities for investors in the risk-taking investment funds.

This solution also contributes to valuation and to the discussion on FVA and CVA, since it requires the pricing of the collateral between counterparties "at arm's length". This price determines the discount rate by applying the absence-of-arbitrage principle. As a consequence, FVA is disentangled from CVA by using the proposed structure as a means.

**Acknowledgements** The KPMG Center of Excellence in Risk Management is acknowledged for organizing the conference "Challenges in Derivatives Markets - Fixed Income Modeling, Valuation Adjustments, Risk Management, and Regulation".

**Open Access** This chapter is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, a link is provided to the Creative Commons license and any changes made are indicated.

The images or other third party material in this book are included in the work's Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work's Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.

# **References**


# **FVA and Electricity Bill Valuation Adjustment—Much of a Difference?**

**Damiano Brigo, Christian P. Fries, John Hull, Matthias Scherer, Daniel Sommer and Ralf Werner**

**Abstract** Pricing counterparty credit risk, although it has been in focus for almost a decade by now, is far from resolved. It is highly controversial whether any valuation adjustment besides the basic CVA should be taken into account, and if so, for what purpose. Even today, the handling of CVA, DVA, FVA, *...* differs between the regulatory, the accounting, and the economic point of view. Eventually, if an agreement is reached that CVA has to be taken into account, it remains unclear if CVA can be modelled linearly, or if nonlinear models need to be resorted to. Finally, industry practice and implementation differ in several aspects. Hence, a unified theory and treatment of FVA and alike is not yet tangible. The conference *Challenges in Derivatives Markets*, held at Technische Universität München in March/April 2015, featured a panel discussion with panelists representing different points of view: John Hull, who argues that FVA might not exist at all; in contrast to Christian Fries, who sees the need for all relevant costs to be covered within the valuation, but not within adjustments. Damiano Brigo emphasises the nonlinearity of (most) valuation adjustments and is concerned about overlapping adjustments and double-counting. Finally, Daniel Sommer puts the exit price in the focus. The following (mildly edited) record of the panel discussion repeats the main arguments of the discussants, ultimately culminating in the awareness that if everybody charges an electricity bill valuation adjustment, it has to become part of any quoted price.

D. Brigo (B)
Department of Mathematics, Imperial College London, London, UK e-mail: damiano.brigo@imperial.ac.uk

C.P. Fries
Department of Mathematics, LMU Munich, Theresienstrasse 39, 80333 Munich, Germany e-mail: christian.fries@math.lmu.de

J. Hull
Joseph L. Rotman School of Management, University of Toronto, 105 St George St, Toronto, ON M5S 3E6, Canada e-mail: hull@rotman.utoronto.ca

M. Scherer
Lehrstuhl für Finanzmathematik, Technische Universität München, Parkring 11, 85748 Garching-Hochbrück, Germany e-mail: scherer@tum.de

D. Sommer
KPMG Financial Risk Management, The Squaire am Flughafen, 60549 Frankfurt, Germany e-mail: dsommer@kpmg.com

R. Werner
Professur für Wirtschaftsmathematik, Universität Augsburg, Universitätsstraße 14, 86159 Augsburg, Germany e-mail: ralf.werner@math.uni-augsburg.de

© The Author(s) 2016
K. Glau et al. (eds.), *Innovations in Derivatives Markets*, Springer Proceedings in Mathematics & Statistics 165, DOI 10.1007/978-3-319-33446-2\_8

**Keywords** Counterparty credit risk · Credit valuation adjustment · Debit valuation adjustment · Wrong way risk

# **1 Welcome**

*Matthias:* Welcome back from the coffee break. After the many interesting talks we already enjoyed today, we will now continue the conference with a panel discussion on current issues in counterparty credit risk. And we are very proud to present you such prestigious speakers on this topic—our anchorman Ralf Werner will introduce them to you in a minute (Fig. 1).

We hope that this discussion will provide you with insights on the current discussion about CVA, DVA, FVA, etc. that go beyond what you can read in scientific papers. In my personal view, these valuation adjustments are a special topic in financial mathematics, because they are not simply expressed by formulas some mathematicians invent and you implement in a spreadsheet. In contrast, these adjustments are challenges a whole bank has to work on as a team, because they can involve different departments, different asset classes, different trading desks, the IT infrastructure, lots of data, etc. Hence, it is not something that is "done" after a scientific paper has been published. Moreover, there is no consensus, neither in academia nor in practice, on what adjustments should be used and how they must be computed. In this regard, I am very happy to see representatives from the financial industry as well as from academia gathering for this discussion.

**Fig. 1** View on the panel. From *left* to *right*: Matthias, Ralf, Daniel, Christian, Damiano, and John

I will now pass the microphone to Ralf Werner who will be our anchorman. Ralf is professor for "Wirtschaftsmathematik" at Augsburg University. Prior to this he was professor at the University of Applied Sciences in Munich, and prior to this he worked for several financial institutions—most of which have defaulted.

*Ralf:* Yes, indeed. Three in total.

*Matthias:* In any case, he gained quite some experience—practical and theoretical—with credit defaults that he is now sharing with you. Thank you very much Ralf!

*Ralf:* Thank you, Matthias, and a warm welcome to everybody from my side. I'm very honoured to chair this discussion. I don't think I will need to do much because we already had an excellent warm-up over lunchtime, and my experience is that these four experts in the panel won't need much input from my side to keep the discussions controversial, yet fruitful.

For the unlikely event that the discussion might get stuck, we have prepared a few additional questions. Further, any question or comment from the audience will be addressed immediately, i.e. we will interrupt whenever possible and whenever meaningful.

The idea is that each discussant has about ten minutes to address one or more topics he deems important. I'll try to dig a bit deeper and if you like you join in asking and eventually after 15 minutes we hand over to the next discussant. This means that in one hour we should be able to pretty much cover everything concerning DVA, FVA, CVA, multi-curve, whatsoever, within the scope of the conference.

Let me now introduce the participants in reverse alphabetical order. I would like to start with Daniel Sommer to my left. Daniel is not only representing one of the main sponsors of this conference, but he's further representing almost 20 years of experience in financial consulting. Daniel is a member of the financial risk management group at KPMG, and for more than ten years he's responsible partner for risk methodology. Daniel holds a PhD on interest-rate models from the University of Bonn, he has published several papers, he is working for all major banks in Germany, so in short he comes with a broad experience of what's going on in the market. I think this is an excellent opportunity for us to challenge his knowledge and his experience.

On the other end of the panel we have John Hull. I asked both John and Damiano during the lunch break, and we agreed that re-introducing the two of them, after we had such great and detailed introductions this morning prior to their talks, would be saying the same thing twice over. John will hopefully talk a bit about FVA, and I assume all of you have read his 2012 paper, see [7]. If not, my introduction may last another 60 seconds, so please at least run through the abstract of this great paper. It's an excellent work that started heavy discussions in the community, fruitful discussions, I'd like to say, raising lots of interesting questions on FVA: *Is it really there? Should it be zero or not?* For me, somehow, the discussion is not yet over, so I am looking forward to what John has to say.

Besides John, between Daniel and Damiano, we have Christian Fries, our local panel member from the LMU. Christian was appointed professor for Financial Mathematics a few years ago. I should emphasise that besides his academic duties he is still mainly working at DZ BANK, where he heads the model development department. Of course, I think you all know Christian from his open-source library and from his book on Monte Carlo methods in finance [5], and I'm sure we will gain a lot of insight from his mixed role in practice and academia.

And, finally, we have Damiano Brigo with us, whom I would like to ask to start right away without further ado, so please, Damiano.

# **2 Damiano Brigo**

*Damiano:* Okay, thank you. I made some of the points during the presentation, but I think it's worth summing up a little what has been happening from my point of view. I started working on what is now called CVA back at the bank in, I think, 2002 or 2003. At the time it was called counterparty risk pricing, not CVA, and nobody was really very interested because the spreads were small for most of the trades, so the work was recycled a few years later, especially in 2007. But as we did that it was clear that this was only a small part of a much broader picture in which we had to update the valuation paradigms used in investment banking, and not only there (Fig. 2).

The big point that seems to come out of that big picture, at least methodologically, is nonlinearity, which shows up in a number of aspects that can be neglected in many cases, but not always. One of these aspects is the close-out: what happens at default. What do you put in your simulation? Should you use a risk-free close-out, where at the first default you just stop and present-value the remaining cash flows without including any further credit, collateral, and funding liquidity effects? Or should you rather use a replacement close-out, where those effects are all included in the valuation at default?

**Fig. 2** Damiano Brigo giving his presentation on "Nonlinear valuation under credit gap risk, collateral margins, funding costs, and multiple curves"

This is a big question. If you go for the replacement close-out, then the problem, as we have seen, becomes recursive, if you like, or nonlinear from a different point of view. And that is not because we mathematicians are trying to push BSDEs or semi-linear PDEs on you. It follows simply from the accounting assumptions: there is a basic accounting rule that says you have to value your deal at default using a replacement value. This is a simple accounting rule, but it translates into a quite nightmarish nonlinear constraint in the valuation. Then, when borrowing and lending rates in the financing of your hedge are asymmetric, if that asymmetry is justified, you have another source of nonlinearity, because to price these costs of carry you need to know the future value of the hedge accounts and of the trade itself. This induces another component of nonlinearity (see [4] and [3]).

Whether it is there or not depends on the funding model you adopt for your treasury. If the trading desk is always net borrowing and the possible liquidity bases are symmetric, you don't have that, and you can more or less work with a symmetric problem; but if it's not always net borrowing, then you do have an asymmetry in the funding rate: one side carries the credit risk of your bank, the other the credit risk of the external funder, plus liquidity bases. So, we all know that borrowing and lending don't usually happen at the same rates (well, we experience it personally, at least).

So, the nonlinearity is there. The big question is *Should we embrace it or keep it at arm's length?*, because it makes things too complicated in practice. The answer, at the moment, is the second: if there is any real nonlinearity in the picture, the required methods, like BSDEs or semilinear PDEs, are very hard to implement on large portfolios in an efficient way, one that ensures you can value the book many times during trading activity, very quickly. This is especially because nonlinearity means the value is not obtained by adding up the values of the assets in the portfolio, so you need to price the portfolio at all the aggregation levels you need, and if each component of such a run is slow, you can imagine what kind of operational nightmare you get into. So I don't think it is realistic or feasible at the moment to embrace nonlinearity. We need to linearise, which means, in the two cases I mentioned, that we assume borrowing and lending rates are the same, which is true for some funding policies, and that we do not use a replacement close-out at default in the CVA calculation of the valuation adjustment for credit.
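
The non-additivity behind this point can be made concrete with a toy one-period sketch (all rates and cash flows here are hypothetical, not from the panel): once borrowing and lending rates differ, valuing two deals separately and adding the results does not reproduce the value of the netted portfolio.

```python
# Toy sketch (hypothetical rates and cash flows) of why asymmetric
# borrowing/lending rates make valuation nonlinear: the value of a netted
# portfolio is not the sum of the values of its parts.

def pv(cashflow, r_lend=0.01, r_borrow=0.03):
    """Discount a single cash flow due in one year.

    A positive claim is discounted at the lending rate; a negative one must
    be financed, so it is discounted at the (higher) borrowing rate.
    """
    rate = r_lend if cashflow >= 0 else r_borrow
    return cashflow / (1.0 + rate)

asset, liability = 100.0, -80.0

value_separate = pv(asset) + pv(liability)  # price each deal on its own
value_netted = pv(asset + liability)        # price the netted portfolio

print(round(value_separate, 2))  # 21.34
print(round(value_netted, 2))    # 19.8
```

Additivity fails precisely because the discount rate depends on the sign of what is being valued; on a large book this is why every aggregation level would need its own valuation run.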

Then the other problem I would mention is keeping all the risks in separate boxes with a label on each box: *CVA: this is credit risk; FVA: this is funding cost; LVA: this is collateral cost*, and so on. This is a little misleading because these risks interact in the way I just described: each cash flow involves the whole future value, which depends on all the risks together. The classification in boxes is useful managerially because you want to assign responsibility in an organisation; you cannot have everyone responsible for everything unless you have a very enlightened kind of workplace, and if you don't, you want to assign responsibility for credit risk to the CVA desk, and maybe the funding costs to a different team in the CVA or XVA desk, and so on. But if these aspects are as connected as I said, it is very hard to separate the risks into different boxes. Wrong-way risk is another manifestation of this dependence: the idea that credit risk can be taken care of separately by the CVA desk while the traditional trading desk handles the trade's main market risk is not very realistic. To some extent you can do it, but it's not precise.

So, these are labels that we apply in order to be able to work operationally in a realistic setting, but they don't have the amount of rigour or precision that we sometimes think they have in practice. So, should we monitor and watch out for manifestations of nonlinearity like overlapping adjustments? We saw that in some set-ups DVA almost completely overlaps with the funding adjustment. Should we be aware of these and avoid the double-counting, or should we just compute the different adjustments, add them up, and forget about all these overlaps and analyses?

I think it's important to have at least an initial understanding of these issues before throwing ourselves into very difficult calculations. There are many other things I could say. The nonlinearity makes deal pricing very difficult, in funding costs especially. When you don't know the funding policy of the other institution, or maybe you don't agree with it but are still asked to pay their funding prices, you might object and go to another bank, or you might in turn say, *I also have some funding costs, and I want to charge you.* And there is no transparency in the funding model of the treasury process. How can bilateral valuation be achieved in a transparent way? This is another problem.

So a number of authors conclude by saying the funding-adjusted value is a value; it's not a price. You can use it for profitability analysis internally, but you shouldn't charge it outright to a client because it's hard to justify this charge fully, as we have seen. On the other hand, and this is the final point I want to raise, which is kind of a meta-topic, I would like to talk about the self-fulfilling aspect of financial methodology: if two or three top banks start doing something, everybody else follows because it becomes the new standard. *Top bank A is doing this, top bank B is doing this, so we have to do this as well.* And then even if something is not justified by financial principles, or is not reasonable methodologically or even mathematically, it doesn't matter, because if you don't do it you place yourself out of the market.

This is very frustrating for a scientist, for someone who thinks there are underlying sound principles behind what's going on, but in the end you are forced to set the problem aside, because that's what the market is doing, and if you don't follow, you are automatically out.

I would like to conclude with that kind of provocative point, and I'm sure my colleagues will have more interesting points to make on it. Thank you.

*Ralf:* Thank you, Damiano. Is there anyone in the panel who wants to take up one of these points? Or in the audience?

*Christian:* I'd like to ask you, Damiano: you said *close-out* value. This is a very important discussion. So, from my point of view, is this an issue for the lawyers, or is this an issue for financial mathematics? What would you say?

*Damiano:* I think it's an issue for both, in the sense that the lawyers should tell us whether it makes sense to have this close-out there or not based on legal considerations. In the end, I don't think we can decide this with mathematics alone. With mathematics we can say, *If you adopt this close-out, the valuation problem is like this, and if you adopt this other one, the valuation problem is like that*, but the decision must be taken based on accounting, financial, and legal principles, not based on mathematics.

I would say that the regulations should converge. We have had ISDA pushing a little towards the replacement close-out, but very mildly. ISDA wrote in 2009 that in determining a close-out amount, the determining party may consider any relevant information, including quotations (either firm or indicative) for replacement transactions supplied by one or more third parties (!) that may take into account the creditworthiness of the determining party at the time the quotation is provided (notice the use of *may*). In the end I think it is a decision for the regulators and the policymakers. We discussed this earlier, but let me be more explicit. The real question concerns your operational model: when the deal has defaulted, do you actually intend to replace it with a new one, or simply to liquidate everything and close the position? If you intend to replace it with another physical deal and re-start the trade with another contracting party, then you should assume a replacement close-out. If you are thinking of liquidating the position, then it stops here, with a cash settlement, and you may use a risk-free close-out. However, from the point of view of continuity, mathematics seems to suggest that you should include the replacement, because you value the trade and mark it to market every day, including credit and funding costs, and all of a sudden at the default event you remove this. You create a discontinuity in valuation this way, which shows up as some funny effects which I don't want to go into right now.
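
The discontinuity can be illustrated with a toy sketch (the rate, spread, and horizon are hypothetical): a deal marked daily at an all-in rate including credit and funding effects jumps in value at default if a risk-free close-out suddenly strips those effects out, while a replacement close-out keeps the valuation continuous.

```python
# Toy sketch of the close-out discontinuity (hypothetical numbers): a single
# cash flow of 100 in 5 years is marked to market daily at an all-in rate
# including a credit/funding spread. A risk-free close-out removes the
# spread at default, so the valuation jumps; a replacement close-out keeps
# it, so the valuation stays continuous through the default event.

r, spread, T, cf = 0.02, 0.03, 5.0, 100.0

mtm_before_default = cf / (1.0 + r + spread) ** T  # daily mark, all-in rate
riskfree_closeout = cf / (1.0 + r) ** T            # spread removed at default
replacement_closeout = mtm_before_default          # spread kept at default

jump = riskfree_closeout - mtm_before_default      # discontinuity at default
print(round(jump, 2))  # 12.22
```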

I think mathematics gives you some hint, but it's really a regulatory / accounting / legal discussion that we should have, and then use the maths to include the outcome properly into the valuation. That's my view.

*Ralf:* Let me exaggerate a bit: will this lead to a situation where your line of reasoning is also applied to mortgages or government debt? Would Greece say, *I'll only pay 60 because I'm valued at 50 anyway, so this is the right replacement value*? Will this lead us into that kind of discussion?

*Damiano:* That is very hard to model because when you have such a large market effect, then the close-out itself could change the economy basically, so I don't think it's very realistic in that sense.

In fact, we found in the published paper [1] that there is no superior close-out. If you use the replacement close-out, you have some advantages in terms of continuity and consistency, but you have problems when the correlation goes up towards the systemic-risk scenario. In that case the risk-free close-out becomes more sensible economically. There is no clear-cut case, and you cannot make a regulation that switches from one close-out to the other depending on correlation or the level of perceived systemic risk. Can you imagine what happens when you are in the middle? I don't even want to go there (see [2]).

So I think we have to be very careful about the maths, and we have to clearly understand which level of aggregation, of size, we're talking about, and in the case of a country, I think that would be quite dangerous.

*Daniel:* I agree.

*Damiano:* At the global derivatives conference a couple of years ago, I was talking to some of the banking quants and I asked, *Which close-out are you using?*, and they would say, *We're using the risk-free close-out because that's the only thing we can implement on a large portfolio.*

*Ralf:* I agree. I've heard this is hidden in the recovery rate, anyway.

*Christian:* So maybe I'd like to comment, or offer a question, on this self-fulfilling prophecy, because I do not understand it. I do understand that if there is some idiot in the market who's trading options at the wrong price, then I can use his incorrect pricing to obtain an implied volatility. Hence, I can imply his dumbness into my model, and that's fine. But now you say that because everybody is doing it, we should do it. And I believe this does not apply to FVA. For me, FVA is a real cost. Even if, for example, the market were now to decree not to account for FVA, I would still have a loss: if I issue a bond at LIBOR plus spread and just put the money at the ECB for a zero interest rate, I have a loss, right? So I would rather go out of the market instead of making the loss.

*Damiano:* Okay, so let me ask you another question. Suppose electricity prices skyrocket and electricity bills become prohibitive. Will you start charging your client an electricity-bill valuation adjustment because that's a real cost you're having? Or will this be embedded in the prices, like in the old days?

*Christian:* It is.

*Damiano:* When you go and buy some bread from the baker, the baker doesn't charge you a running water and electricity bill valuation adjustment because he needs some water to run his bakery, you know ...

*Christian:* Yeah, but if you go to the bakery, he charges you such that he is covering all his costs.

*Damiano:* That's right.

*Christian:* It's just not transparent.

*Damiano:* That's right.

*Christian:* But the cost is inside the price.

*Damiano:* But then if you add these valuation adjustments one by one, one after the next, every year a new one, with the nonlinearity effects we see that they possibly overlap, and you are sometimes overcharging. This is not good, and that's what I feel is happening.

KVA. Think about it. KVA is a valuation adjustment for capital requirements, but future potential CVA losses themselves trigger capital requirements, so you have a valuation adjustment on a valuation adjustment. This is getting out of hand.

*Christian:* This point I understand, but that is regulatory …

*Damiano:* But going back to the *self-fulfilling prophecy*, the other thing I wanted to say: think about base correlation. For CDOs, base correlation is a model where you use a Gaussian copula, flatten 7,750 correlations into one, and apply a different flat correlation to each tranche on the same pool. To explain a panel of 15 CDOs you have 15 different and inconsistent models. And then, I kid you not, once at an international conference I met a quant from a top bank who was lecturing about base correlation along the lines of *here's an example of calibration, this is a great model, you should use it, CDOs are great, invest in this*. When I said after his talk, *I have some questions for you about this model*, he replied, *Oh, I'm the marketing quant. I don't do models really.* And I said, *Take me to your leader!*, meaning the real quants, and he said, *Oh, you cannot talk to them; they don't talk to the public. My function is to convince people, investors, and the market that this is a great model, this is a great product, and everybody must come into this market.*

However, what you are saying is partly true. If the market is complete in a way, then by hedging according to the correct hedging strategy you can prove that your price is right against an opponent; but if the market is largely incomplete, this is very hard to do. And this is what we face when we look at funding costs. We don't know the hedging or the funding policy of another entity. It's not transparent. You don't know what they're doing, how they're financing themselves, their short-term/long-term funding policy, their internal fund transfer pricing, their bases. You don't know many things.

*Christian:* This is exactly the point. The market is not complete here, and I cannot pass this risk to someone else. This is my example with the volatility: if someone is on the wrong volatility I can pass this risk to him, but with my funding it's still my risk and it's my cost to cover it. I believe it has to be in there. If you make it transparent, it's something different, maybe.

*Damiano:* Okay, but then you have to really watch out for the overlaps as you add new risks. For example, in some formulations, if you take into account the trading DVA and also the full funding benefit, you count the same thing twice. You have to be very careful there. So this practice of adding a new adjustment on top of the old ones every year is very dangerous, because you may miss some of the overlaps. The banks are paying attention to it; it's not that bad. But if it develops to the point that in ten years we have 15 new valuation adjustments, it will be out of control.

*Audience member:* I have a question, because I really like this bakery example. Let's say you have one bakery that sells bread for 1.80 and doesn't have very high electricity costs, and another bakery that sells it for 2.00 because it has much higher electricity costs. So what is the market price, then? Is it 1.80?

*Damiano:* If you distinguish a clean price from an adjusted price, the price would be the clean price without costs. But then, of course, the price is adjusted into an operational price that takes the bill into account; the bill is just not quoted explicitly, it's embedded in the price, so that if you think this baker is too expensive, you go to the other one. Maybe the other one is out of town, and has lower costs because of that.

But in other industries we always knew that the price of a good you end up buying depends on many circumstances that are not captured in a theoretical price. Somewhat ironically, part of the finance industry arrived at this realisation quite late. But that's another matter. I have taken too much time and I don't want to monopolise this panel.

*John:* Don't forget that we have bid-offer spreads in this industry. Those bid-offer spreads are designed to cover overhead costs, so adding in costs for electricity and other things is not really the way to do it.

*Ralf:* Thank you, John. Let me hand over to Christian. Christian, maybe you want to tell us your opinion on what's going on in financial institutions at the moment. Maybe with some more focus on the practitioner's point of view.

# **3 Christian Fries**

*Christian:* You've asked me to make a few statements and I take the role of the practitioner.

I have the same opinion as Damiano, but I'd like to make the point that I don't like the adjustments. Why? Maybe because the word "adjustment" already implies that you did something wrong: if I have to adjust something, it tells me that the original value was wrong. For example, in my car there is this small device that tells me how long it takes to get from Frankfurt to Munich, and what would I like to see there? With my car it takes five hours. I could also fly; it would take one hour if you take the plane, but then you have to add four hours' adjustment. So I would prefer just to see the five, because the five is correct. The one hour is no information for me.

Then, let me give you another example. Consider a swap which exchanges LIBOR against a fixed rate; this swap is traded at a bank, usually at a swap desk, in what is sometimes called flow trading. Then we have another swap that exchanges LIBOR, capped and floored, against the fixed rate; this is called a structured swap, and it's traded at a different desk, sometimes called the nonlinear trading desk because these people are doing the nonlinear stuff. But except sometimes for information purposes, we do not express the price of the structured swap as the price of the linear product plus a nonlinear trading premium. So there is no such thing as an option-valuation adjustment; we do not have an OVA or something like that.

*Daniel:* Going back a few years, people tried to calculate option-adjusted bond spreads.

*Christian:* Yes, I know, and I am sometimes reminded of it. So there is one desk in the bank that takes responsibility for all this complex stuff. This desk also makes transactions through the swap desk, because it needs to hedge its interest-rate risk, so it hedges out all the linear stuff to the other guys and keeps all the nonlinear risks. Let me make a remark about FVA; I will come back to CVA. For me, FVA has a strong analogy to cross-currency, to multi-currency models, at least if you have the same rate for borrowing and lending. Each issuer has its own currency. What is his currency? His currency is the bond he's issuing. Everything has to be denominated in his own interest rate, his funding rate. There are even instruments in the market which profit from this arbitrage between two banks with different funding: the total return swaps, where one bank with poor funding goes to another bank with good funding, they exchange funding, and they both profit from the deal. I mean, the market for total return swaps is currently dead because funding is essentially for free, but these things existed. I have a little paper with my colleague Mark Lichtner on this (see [6]).

This currency analogy: we have had this in multi-currency models for years. We know how to value instruments in different currencies, and we see the same phenomena in currencies. For example, the cross-currency swap exchanges a floating rate in one currency for a floating rate in another currency. In theory, this should be worth zero: both legs are floaters which are at par. But cross-currency swaps trade at a premium; there is a cross-currency basis spread. The reason is that there is a preference in the market: one likes to finance oneself in U.S. Dollars and not in Euros (or vice versa), so, for example, a Euro bank would prefer Euro financing over U.S. Dollar financing. I believe that FVA is something very natural. It has also been present in mathematical theory through this currency analogy for a long time, and it should be recognised inside the valuation, because we wouldn't value Euro derivatives using the U.S. Dollar curve, would we?
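
The par-floater point can be checked with a small sketch (flat hypothetical curves, annual payments, unit year fractions): a floater is at par only when its forward rates and its discount factors come from the same curve, which is exactly where a cross-currency or funding basis enters the valuation.

```python
# Toy sketch of the currency/funding analogy (hypothetical flat curves):
# a floater is worth par only when its forwards and its discount factors
# come from the same curve. Discounting on a different curve -- another
# currency's, or the issuer's own funding curve -- moves it off par.

def floater_pv(fwd_rate, disc_rate, years=5, notional=100.0):
    """PV of annual coupons fixed off a flat forward curve, plus notional."""
    pv = 0.0
    for t in range(1, years + 1):
        df = 1.0 / (1.0 + disc_rate) ** t
        pv += notional * fwd_rate * df           # coupon at the forward rate
    pv += notional / (1.0 + disc_rate) ** years  # redemption of the notional
    return pv

print(round(floater_pv(0.03, 0.03), 2))  # 100.0: at par on its own curve
print(round(floater_pv(0.03, 0.04), 2))  # 95.55: off par on a funding curve
```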

One more word on CVA. If I may be provocative, I would say, as Damiano already pointed out, that counterparty risk isn't something new. We had a defaultable LIBOR market model years ago, and counterparty risk was considered years ago, maybe only for credit derivatives, but it's not so new. What is actually new here is that we suddenly have to look at netting. So the big change for me in this valuation-adjustment topic is that we are talking about portfolio effects. As Damiano said this morning, the sum of the individual product valuations does not give you the value of the portfolio. So you have portfolio effects; you have to value everything in a single huge valuation framework, treating all the products of a bank as one portfolio, as one single product. I believe the theory needed to do this is, to some extent, known; the big problem is how to implement it numerically on the computational side. For me, this is the main motivation for these valuation adjustments: we have computational problems, and we would like to decompose the valuation into per-product valuations that we can sum up.
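
The netting effect can be illustrated with a toy Monte Carlo (the two trades and their distributions are hypothetical): expected positive exposure computed deal by deal and then added up overstates the exposure of the netted portfolio, which is why per-product numbers cannot simply be summed.

```python
import numpy as np

# Toy Monte Carlo of the netting/portfolio effect (hypothetical trades):
# the expected positive exposure of a netted portfolio is at most the sum
# of the standalone expected positive exposures.

rng = np.random.default_rng(42)
n = 100_000
a = rng.normal(0.0, 10.0, n)      # simulated future value of trade A
b = -a + rng.normal(0.0, 2.0, n)  # trade B largely offsets trade A

epe_separate = np.maximum(a, 0).mean() + np.maximum(b, 0).mean()
epe_netted = np.maximum(a + b, 0).mean()

print(epe_netted < epe_separate)  # True: netting reduces the exposure
```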

Going back to FVA, I do not understand why many people still use the risk-free interest rate as the basis for the reference valuation, because, first of all, I don't believe there is such a thing as a risk-free interest rate; it's just a misnomer. And wouldn't it be better to keep the adjustments as small as possible, such that the price you calculate is already as close as possible to the true price? For example, my navigation system in the car tells me that from Frankfurt to Munich you need four hours and thirty minutes. Okay, when I drive I need five hours and thirty minutes, but it gives me a good proxy. The proxy uses the average information available.

So, coming back to Damiano's talk, maybe we should simplify things. I like to have things simplified, and my question is how we can simplify things such that we can implement them in a bank. For example, we can simplify and say that treasury uses an average funding rate which is in the middle of the bid-offer, and we use that rate to calculate the funding costs, so that we have symmetry there, and so on.

Finally, I would like to have just one desk where nonlinear effects are managed. The question is how we can have this set-up in a bank. We could have it if we have internal transactions in the bank, and these transactions are fully collateralised. So the linear traders trade collateralised transactions with the nonlinear trading desk, and the nonlinear trading desk keeps the residual.

My conclusion is that I would like to have one formula, or one model, which gives me the true price, and then we can set up internal transactions. But what is a good way to set up these internal transactions such that we can implement them in a bank? This is my concern.

*Audience member:* Talking about implementation in the bank: What can you actually implement? Where is banking nowadays? For CVA, we have all the data, I assume, but no clue about wrong-way risk or these correlations you need, and you are already thinking about FVA and adjustments on adjustments while you still haven't managed to find a decent proxy for wrong-way risk. The question is: are we looking at, and solving, the right problems? What is your impression?

*Christian:* The data is actually the critical thing here. We can include more and more effects in a nonlinear trading valuation framework by improving the model, for example with the approaches we have seen here including wrong-way risk, copulas, whatever, but the problem is that we do not actually have the data to calibrate the model.

For example, going back to John's talk this morning, I have a little comment: you see the effect of the multi-curve switch from LIBOR to OIS, but in this calculation there is an assumption. The assumption is that a swap which is LIBOR-collateralised (so we use LIBOR discounting) trades at the same swap rate as a swap which is OIS-collateralised (so we use OIS discounting). If you assume the same swap rates, you get different forward rates. That's what we saw this morning.

The problem is that you do not observe the swap rate for a LIBOR-collateralised swap. So it could even be that the swap rates are different and the forwards are the same. If we value, for example, an uncollateralised product, we do not even know what the correct forward rate is, because we would need an uncollateralised swap to calibrate this forward rate. The data problem starts at the very beginning. The problem is data.
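
The dependence of the forwards on the discounting assumption can be reproduced in a toy two-period bootstrap (all rates are hypothetical): feeding the same quoted swap rate through two different discount curves yields two different forward rates, because the forward is an output of the discounting assumption, not an observable by itself.

```python
# Toy two-period bootstrap (hypothetical rates) of the multi-curve point:
# the same quoted 2y par swap rate implies a different second forward rate
# depending on whether you discount on a LIBOR-like or an OIS-like curve.

swap_rate_2y = 0.030  # same quoted 2y par rate under both conventions
f1 = 0.028            # first-period forward, assumed already known

def bootstrap_f2(disc_rate):
    """Solve f2 from the par condition with annual payments and tau = 1:
       df1*f1 + df2*f2 = swap_rate_2y * (df1 + df2)."""
    df1 = 1.0 / (1.0 + disc_rate)
    df2 = 1.0 / (1.0 + disc_rate) ** 2
    return (swap_rate_2y * (df1 + df2) - df1 * f1) / df2

f2_libor = bootstrap_f2(0.030)  # discounting at a LIBOR-like flat rate
f2_ois = bootstrap_f2(0.005)    # discounting at a lower OIS-like flat rate
print(f2_libor != f2_ois)  # True: same swap rate, different forwards
```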

*Ralf:* Do you agree, Damiano?

*Damiano:* I talked to one of the CVA traders at a top tier-1 bank. They told me they have what they call zero-order risks in mind, more than cross-gamma hedging. What they don't have for many counterparties is a healthy default-probability curve, because there is no liquidity in the relevant CDS. Maybe they have a product with the airport of Duckburg; this airport hasn't issued a liquid bond and there is no CDS. Where do you obtain the default probability? From the rating? But that's a physical measure, not a risk-neutral measure. And then the wrong-way correlations: you should use market-implied correlations because you are pricing, but where do you get them? It's almost impossible for many assets. Finally, I would say that with CVA you're right: we talk about KVA, but CVA is still very much a problem, and there is what I call payout risk. Depending on which close-out you use, and on whether you include the first-to-default check or not (some banks don't, because by avoiding it you avoid credit correlation, which is a bad beast in many cases), you have five or six different definitions of CVA, and that is payout risk. With old-style exotics, you had a very clear description of the payout; you implemented the dynamics, you got a price and a hedge, and the price would change with the model; that was model risk. Now with CVA we have payout risk: we don't even know which payout we are trading exactly, unless we have a very precise description of the CVA calculation.
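
When a liquid CDS does exist, the standard shortcut for a risk-neutral default probability is the "credit triangle" (the spread and recovery below are hypothetical); the point about illiquid names is that there is simply no market-implied spread to feed into it.

```python
import math

# The "credit triangle" shortcut behind the default-probability discussion
# (hypothetical spread and recovery): a flat risk-neutral hazard rate is
# roughly spread / (1 - recovery), from which default probabilities follow.
# For an illiquid name there is no CDS spread to plug in at all.

spread = 0.0200   # 200 bp running CDS spread
recovery = 0.40   # assumed recovery rate

hazard = spread / (1.0 - recovery)     # flat default intensity, approx. 3.33%
pd_5y = 1.0 - math.exp(-hazard * 5.0)  # risk-neutral 5y default probability
print(round(pd_5y, 4))  # 0.1535
```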

But it's not as if, when you ask another bank, *What CVA charge are you applying to me?*, they tell you, *It's first-to-default inclusive, with a risk-free close-out, and I'm using this kind of CDS curve.* Sometimes they don't tell you that, and you don't know.

*Ralf:* Daniel, do you have the same experience?

*Daniel:* Absolutely. Even though many banks are talking about FVA these days, I think CVA is still an unresolved topic, and our observation is that even in a small market like the German market, banks take a lot of different approaches to calculating CVA. The problem is becoming more difficult by the minute, as observable, tradable, and liquid CDS prices get fewer and fewer. So this is an issue that gets more complicated by the minute.

And then another observation: we had a talk about wrong-way risk this morning, and we learned about the difficulties it involves. Not surprisingly, it is our observation that many banks are far from including wrong-way risk in their CVA calculations, so there is a long way to go before even CVA is settled.

*Ralf:* Okay, thank you very much, Daniel.

*John:* Maybe I should just respond to the point that Christian made about my presentation this morning. My swap rates were all fully collateralised swap rates, which would today reflect OIS discounting. I think Wolfgang [Rungaldier] called them the clean rates. As soon as you look at the uncollateralised market, any rates you see are contaminated by CVA and DVA.

You say, *Use LIBOR discounting.* I would say the correct thing to do, even with uncollateralised transactions, is still to use OIS discounting and to calculate your CVA and DVA using spreads relative to the OIS curve. Forget about the LIBOR curve; it is no longer appropriate for valuing derivatives. It could by chance be that LIBOR is the correct borrowing rate for the counterparty you're dealing with, but in most cases the borrowing rate of an uncollateralised end user is different from LIBOR, so LIBOR is not a relevant rate. I don't care whether we call the OIS rate the risk-free rate or not, but it is the best close-to-risk-free benchmark that we have.

*Ralf:* Thank you, John. It's now your turn, so please continue with your statement.

# **4 John Hull**

*John:* It's hard to know where to start, because I have written quite a bit on FVA in the last few years. I have actually consciously decided to stop, because I realised I could spend the whole of the rest of my academic career writing about this and never convince most people.

Actually, my interest in FVA has an interesting history. In the middle of 2012, I got a call from the editor of Risk magazine saying, *We're bringing out the 25th anniversary edition of Risk magazine. We'd like you to write an article for it.* I agreed to write the article. (No academic ever says no to writing an article.) I asked, *What would you like me to write about?* He said, *We don't mind what you write about, so long as it's interesting to our readership. But, by the way, we need the article in three weeks.*

I went down the corridor to discuss this with my colleague Alan White. We had a number of interesting ideas for the article. After two and a half weeks we settled on FVA. The trouble was that we then had only three days to write the article. In retrospect, I wish we'd had longer. So what did that article say? That article said you should not make an FVA adjustment. I'll explain why in a minute. The reaction to the article was interesting. Usually when you write these articles, nothing much happens. You get maybe a little bit of a response from a few other academics. But in this case we were absolutely inundated with emails from people about this article. Two-thirds of the emails were saying *You're crazy. You don't know what you're talking about. Clearly there should be an FVA adjustment. We've been doing it for a while now …* and so on.

The other one-third were a little bit more positive, and some of them even went so far as to say, *We're glad someone's finally said this because we were a little uncomfortable with this FVA adjustment.* And, of course, Risk magazine realised that this was an exciting topic for them, so they started organising conferences on FVA.

Two people from Royal Bank of Scotland wrote a rejoinder to our article, which appeared in the next issue of Risk. And we were invited to write a rejoinder to the rejoinder, and so it went on. It was a really crazy time.

What I very quickly found out was that Alan and I had a different perspective from most of the people we were corresponding with, and the reason was that we had been trained in finance. We moved from finance into derivatives, whereas most of the people we were talking to had moved from physics or mathematics into derivatives. One important idea in corporate finance is that when you're valuing an investment, the discount rate should be determined by the riskiness of the investment. How you finance the investment is not important. Whether you finance it with debt or equity, it's the riskiness of the investment that matters. In other words, you should separate out the funding from the valuation of the investment (Fig. 3).

That was where we were coming from. In the case of derivatives a complication is that we can use risk-neutral valuation, so we've got a nice way of doing the valuation, but that does not alter the basic argument. Expected cash flows that are directly related

**Fig. 3** John Hull giving his presentation on "OIS discounting, interest rate derivatives, and the modeling of stochastic interest rate spreads"

to the investment should be taken into account. In the case of derivative transactions these expected cash flows include CVA and DVA.

So that's where we were coming from. We've modified our opinion a little bit recently. I think I'm more or less in the same camp as Damiano here, judging by his presentation. Let's suppose that you fund at OIS plus 200 basis points. If the whole of the 200 basis points is compensation for default risk, then you are actually getting a benefit from that 200 basis points, in that it reflects the losses to the lender (and benefits to you) of a possible default on your borrowings. That is what we call DVA 2, what Damiano called DVA(F), and what other people have called FDA. This is not what we usually think of as DVA. What we usually think of as DVA is the fact that as a bank you might default on your derivatives, and that could be a gain to you. Here we are applying the same idea to the bank's debt.

DVA 2 cancels out FVA, and that was the main argument we made in that Risk magazine article. But if you say that the bank's borrowing rate is OIS plus 200 basis points where 120 basis points is for default risk, and 80 basis points is for other things—maybe liquidity—we can argue that 80 basis points is a dead-weight cost. It's part of the cost of doing business, you're not getting any benefit from that 80 basis points. You are getting benefit from the 120 basis points: a DVA-type benefit because you can default on your funding.

So I think I am in the same camp as Damiano. I think he called it LVA. This component of your funding cost, which is not related to default risk, is arguably a genuine FVA. The problem is, of course, that it's very, very difficult to separate out the part of your funding cost that's due to default risk and the part that's due to other things.
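John's decomposition of the funding spread can be illustrated with a small calculation. The spread figures are the hypothetical ones from his example (OIS plus 200 basis points, of which 120 compensate for default risk); the notional and horizon are made up for illustration, and this is arithmetic only, not a pricing model:

```python
def decompose_funding_spread(total_spread_bp, default_bp, notional, years):
    """Split an annual funding spread over OIS (in basis points) into a
    default-risk part (offset by a DVA 2 benefit: the bank may default on
    its own funding) and a dead-weight part (arguably a genuine FVA).
    Illustrative arithmetic only; the split itself is assumed to be given."""
    deadweight_bp = total_spread_bp - default_bp
    dva2_benefit = notional * default_bp / 10_000 * years   # offset by DVA 2
    fva_cost = notional * deadweight_bp / 10_000 * years    # genuine FVA
    return dva2_benefit, fva_cost
```

With these figures on a hypothetical 100 million notional funded for one year, the 80 basis point dead-weight part costs 800,000, while the 120 basis point default part is matched by a DVA 2 benefit of 1,200,000.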

And then another complication is, of course, that accountants assume—for example when calculating CVA—the whole of your credit spread reflects default risk.

I have lots and lots of discussions with people on this. You realise very quickly that you're never going to convince somebody who's in a different mindset from yourself on this. One important question, though, is what are we trying to do here? With these sorts of adjustments, are we trying to calculate a price we should charge a customer? (Obviously in this day and age, we would be talking about the price we should charge an end user because transactions with other banks are going to be fully collateralised.) Or are we concerned with internal accounting? Or is it financial accounting that is our objective? I've always taken the view that what we're really talking about here is what we record in our books as the value of this derivative. But if you take the view that what we're trying to do is to work out what we should charge an end user, a customer, then actually I have no problems doing whatever you like, even trying to convince a customer that the customer should pay an ECA, an electricity cost adjustment. We all know that what you're trying to do is get the best price you can and hopefully cover your costs.

What I found when I was talking to people about FVA is that you start talking about how derivatives should be accounted for and very quickly slip into talking about how much the customer should be charged, which is a totally different issue. Obviously, there are all sorts of costs you've got to recover in terms of what you charge the customer.

Where are accountants coming from? As you all know, accountants want you to value derivatives at exit prices. The accounting bodies are quite clear, that the exit prices have nothing to do with your own costs. Exit prices should be related to what's going on in the marketplace. Therefore, your own funding costs can't possibly come into an exit price. If other dealers are using FVA in their pricing, their funding costs may be relevant, but your own funding costs are not relevant. An interesting question is how should we determine exit prices in a world where all dealers are incorporating FVA into their pricing. Should we build into our exit price an average of the funding costs of all dealers or the funding cost of the dealer that gives the most competitive price? You can argue about this, but it is difficult to argue that it is your own funding costs that should be used in accounting.

What we have found is there's a lot of confusion between DVA and FVA, and as I said there's really two distinct parts to DVA. There's the DVA associated with the fact that you may default on your derivatives. That's what we call DVA 1. It's the usual DVA. Your DVA 1 is your counterparty's CVA and vice versa. And then there's what we call DVA 2, which is the fact that you might default on your funding.

Banks have always been uncomfortable with DVA. Even though accounting bodies have approved DVA they dislike the idea of taking their own default risk into account. This has led some banks to replace DVA by FVA. In this context, FVA is sometimes divided into a funding benefit adjustment and a funding cost adjustment with the funding benefit adjustment being regarded as a substitute for DVA.

When you look at what's actually going on right now, banks are all over the place in terms of how they make funding value adjustments. I agree with Damiano that once JP Morgan announced that it is taking account of FVA, then everybody felt they had to do it as well. The correctness of FVA becomes a self-fulfilling prophecy. A bank's auditors are going to say, *Everybody else is doing this? Why aren't you doing it?* Whether or not you believe the models used by everyone else are correct, you have got to use those models to determine accounting values.

You can have research models for trading, but for accounting you've just got to do what everybody else does. When a critical mass of people move over to doing something, whether it's right or wrong, you've got to do it.

I notice from a recent article in Risk that the Basel committee is getting interested in funding value adjustments. And U.S. regulators are getting interested in funding value adjustments as well. In addition, I can tell you that a few months ago, Alan White and I were invited to FASB to talk to them about funding value adjustments. They have concerns about the use of FVA in accounting. They like derivatives accounting valuations to be based on market prices, not on internal costs.

I think we are in a fairly fluid situation here. When JP Morgan has said, *We're doing it this way, and we're taking a one-and-a-half-billion-dollar hit*, it is tempting to believe that everyone else will follow suit and that is the end of the story. I don't think it is the end of the story because we have not yet heard from accountants and regulators. Also, I think it is fair to say that the views of banks and the quants that work for them are evolving.

There's some good news. (Maybe it's not good news if you're a quant working for a bank.) The good news is that we're clearly moving to a world where all derivatives are fully collateralised. We're now in a situation where if you deal with another financial institution or another systemically important entity, you've got to be fully collateralised. Dealing with an end user, you don't have to be fully collateralised. But there's a lot of arguments (we talked about some of them at this conference) suggesting that end users will get a better deal if they are fully collateralised.

FVA is not going to be such a big issue going forward. Indeed, I think it's going to fade away as full collateralisation becomes the norm. But no doubt arguments about some other XVAs will continue.

*Ralf:* Thank you, John. I take away that it is wise for PhD students not to pursue too much research on FVA; it might not be worth the effort …

*Audience member:* Sorry, just if you'll allow me a little comment. Since the issue of the self-fulfilling prophecy was picked up also by John Hull, just a little comment from a mathematical point of view. If you do mathematics for applications, you need a model. Possibly a true model. So what is a true model? Now, if you do applications in the natural or physical sciences, possibly there is a true model. It is very complicated, and what you do is choose a model that is a good compromise between representativeness and tractability, so you can deal with this model and it's still relatively good.

Now we come to the social/economic sciences. What is the true model? If, at some point, the majority sort of implicitly uses a sort of model, isn't that all of a sudden the true model that other people should follow, or am I wrong here?

*Damiano:* Like base correlation, for example?

*John:* Yes, I don't see it quite that way, though. I think opinions will fluctuate through time. Nearly all large global banks do make funding value adjustments now. There are two or three holdouts, but most of them do.

I think FVA is going to be more of a fad than a truth. I think that in five years' time we could be in the opposite position to today: everybody just decides they don't want to make these funding value adjustments. That's just my own personal opinion.

One thing I meant to say is that there are interactions between CVA and DVA. If one side defaults, you don't care about the other side defaulting later, and there are a number of other close-out issues. I agree with what Damiano says. Those create a lot of complications. And they are relevant, because those are complications in assessing expected cash flows arising from the derivatives portfolio that you have with a counterparty. They have nothing to do with funding. They have to do with expected future cash flows, which are the relevant things for calculating a valuation. It does make the valuation more complicated, but to overlay that with funding adjustments I don't think is correct, except insofar as some part of the funding value adjustment is the dead-weight cost I was talking about.

*Ralf:* Thank you, John. You mentioned valuation, so maybe this is the keyword to hand over to Daniel.

# **5 Daniel Sommer**

*Daniel:* First, John, as you immediately addressed the accounting profession: I'm not an accountant, but I work for a firm that does audit and accounting as part of its business. Are our accountants just people who tell the banks to do what everybody else does? The story is slightly more complicated than that, because what accountants are interested in, and here I pick up this story about self-fulfilling prophecies, is eventually fair value. And, indeed, for financial instruments that's defined as the exit price. But then the big question is: How do you find out what the exit price actually is?

Because it's not like, for all the instruments that we're talking about in this seminar, it's something that you can read on Bloomberg or any other data provider. It's nothing that people will tell you in the street immediately. It's rather a complicated exercise to find out what fair value actually is. What would be the exit price at which you could actually exit your position? It's at that point where that whole reasoning comes up with the notion of how other people are thinking about valuing a certain position. How are my counterparties, my potential counterparties in the market, thinking about it? And that gives a bit more sense to the statement *Do what everybody else does*. Because if everybody else is taking certain aspects of a financial instrument into consideration when valuing this asset, it's very likely that the exit price that you are offered will also take that into consideration. It's for that reason that accountants are interested in what everybody else is doing, and frankly speaking, yes, at KPMG, that was indeed the discussion we had with many banks over the last three or four years, where we met the banks in London on various panels to discuss FVA with them. Those were quite open discussions. From one year to the next, we sort of made a roll call and asked who's going to do what next year and when do you think you will be moving to FVA, etc., just to get a feeling for where the market was going, in order to have a better understanding of what the market thought fair value would be. In that sense, I think that gives a bit more meaning to accountants telling the banks to do what everybody else is doing.

Now, coming to the current situation, I think there is indeed no major bank left globally that has not declared it is doing something on funding valuation adjustments, with a lot of banks having come up with that in their 2014 year-end accounts. So I think the pressure on those banks that have not yet done so is actually rising. That's something which I think is a matter of fact.

I'm happy to comment or give my personal opinion about FVA, and perhaps talk about it by going back to some anecdotal evidence which I came across during the financial crisis. Before that, let me just mention a few more things.

Indeed, the regulators have become interested in FVA, and I think that there are at least two big issues with real effects on the banks that will enter, or should enter, the regulatory discussion. One thing is, indeed, the overlap between FVA and DVA, where many banks are happy to scrap DVA to a certain extent and replace it with FVA, because that will have an immediate effect on their available regulatory capital. As they do the calculations these days, they offset FVA benefits and FVA costs. Thus reducing DVA, which they need to deduct from core Tier 1 capital, has a real effect on banks' balance sheets and on their profitability calculations regarding regulatory capital.

The other thing people mentioned, and it is true: hedging FVA, just as with CVA, is a complicated issue and also involves hedging the related market risk. And so the question that we have been debating for CVA for a long time already is whether you are allowed to include the market risk hedges in your internal model for market risk or not. We've seen some movements in this direction recently by the regulators, but I think that those are two questions that should at least be quite prominent in the regulatory debate coming up.

That's one thing. The other thing is related to accounting. People quite casually mentioned that, well, yes, we need to go from single-deal valuation to portfolio valuation. And indeed for CVA that's absolutely inevitable. Nevertheless, if you do that, it raises a few uncomfortable questions for an accountant, because it raises the question: What is actually the unit of account? Apparently it's not a single deal. It may be the netting set as far as CVA is concerned, but when you look at funding, the netting set may even be too small, so it may be some sort of funding set: all the deals that you have in one currency, say. When you look at effects on the balance sheet, do you need to value your whole bank before you can actually value your derivatives correctly? That's a bit of an uncomfortable direction we're going in.

Those are a few comments on things that people have said up to now, but on FVA itself, let me give you a little anecdote from the financial crisis. During the financial crisis, the CFO and CEO of one of our top-ten German banks asked me: *Look, all the banks have to reduce the values of their ABS and CDO books. Don't you think that if a book is match-funded, it should be worth more than if it is not match-funded?* And this goes back to the really fundamental question of liquidity risk and whether liquidity should play a role in pricing. And everybody who has read Modigliani and Miller would say, *By no means.* That would be the standard answer. Nevertheless, when you come to think about the situation that the banks were in during the financial crisis, actually having a match-funded book gave you at least the option to wait. And there was real value in that option, as the banks who were able to wait were able to realise, because much of the write-downs that happened during the financial crisis actually came back: defaults were not as heavy as was thought at the time and as indicated by the quotes at the peak of the crisis. It wasn't even traded prices at the time; it was basically quotes that banks were valuing their books on.

One might think (a very personal view at this point) that if banks go for match-funding their books, it's like buying a very, very deep out-of-the-money option that they can then exercise when things get really bad. So that's one comment I would like to make.

The other point is somewhat more disconcerting. What does being liquid mean in a world that has had the experience of the financial crisis? Is it sufficient to say that a bank is liquid if it can generate enough funds through the collateralised inter-bank money market? Or does a bank have to have access to sufficient central bank money to prove that it is liquid? At least the experience of the financial crisis showed the vulnerability of the inter-bank market and the importance of central bank money in keeping the system afloat. In that case at least part of the liquidity costs of banks would be due to ensuring they have enough central bank money or assets that can swiftly be turned into it. But if that were so, then this would change our whole valuation paradigm, which after all is based on general equilibrium theory and the theory of value of Gérard Debreu and others. In this theory there is no need for a central bank to keep the system working. Therefore, acknowledging the existence of funding costs through the introduction of FVA may have far-reaching consequences for derivatives pricing theory, beyond just the calculation of some odd valuation adjustment and quarreling about which funding curve to use to determine an exit price.

*Ralf:* Thank you, Daniel. John, do you want to comment on this? Are Modigliani and Miller still valid in such an environment?

*John:* Well, I think it is, but what Modigliani and Miller say is that if you cut the pie up, the sum of the pieces is worth the same as the whole. Now, the question is, who are the potential stakeholders you've got to look at when you cut the pie up.

I agree with pretty much everything that Daniel said. It makes a lot of sense.

*Ralf:* Christian?

*Christian:* I have a question, maybe from the practitioner's side, also being a little bit of a quant, with respect to the exit price, which keeps puzzling me. Just to make that clear, for me there are at least two prices. The exit price I can realise only once: by going out of business. There's only one opportunity to realise the exit price. And there is, of course, the price which I use in calculating my risk sensitivities and my hedge, which I use in solving my optimal control problem, my risk management problem.

So, for example, if there were some kind of going-out-of-business tax, the exit price would clearly include this tax, but of course as long as I'd like to stay in business I would never charge that tax, and I would not include it in my hedging because it would never be incurred.

What is strange for me is that I believe the right price for the optimal control problem, for how you hedge and so on, is actually the going-concern price and not the exit price. But the balance sheet uses the exit price, and it appears to me as if management is always looking at the balance sheet. Isn't there some kind of contradiction? What is the price that should be used to find the optimal path for the company, to make the investment decisions and so on?

*Daniel:* First of all, it's very clear that what the accounting standards mean by exit price is by no means the price at which the bank would go out of business. It's a going concern still. Of course it's an artificial concept in the sense that you will never … even if you were to sell just a portfolio of your trading book, you would probably not realise what accountants think of as the fair value because they explicitly rule out including portfolio effects on this fair value.

What this exit price actually means is that two people meet in the market and agree on a certain price at which to exchange a position without changing the market equilibrium; the position has to be small relative to the market.

*Christian:* For example, for my own bonds, the exit price is my bond value, which obviously includes my funding, and for uncollateralised derivatives it is the derivative valued with some average market funding. And if I take your example of fully matched funding, this puzzles me, because the bonds reflect my funding and the uncollateralised derivatives do not.

*Ralf:* I think this goes in the same direction as my question to Damiano about the close-out value—what value to use. I think we probably will not solve this puzzle today. Looking at the time, I would like to thank all of you for your attention. Thank you very much to all panelists, and I suppose there's plenty of time for further discussions during the dinner tonight. Thank you!

# *5.1 Acknowledgements, Credits, and Disclaimer*

All statements made in this panel discussion represent the personal views of the participants. Photographs of Damiano Brigo and John Hull by Astrid Eckert; photograph of the panel by Bettina Haas. The transcription of audio tape to text was made by Robin Black. For help with the manuscript we thank Florian Zyprian.

**Acknowledgements** The KPMG Center of Excellence in Risk Management is acknowledged for organizing the conference "Challenges in Derivatives Markets - Fixed Income Modeling, Valuation Adjustments, Risk Management, and Regulation".

**Open Access** This chapter is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, a link is provided to the Creative Commons license and any changes made are indicated.

The images or other third party material in this book are included in the work's Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work's Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.

# **Part II Fixed Income Modeling**

# **Multi-curve Modelling Using Trees**

**John Hull and Alan White**

**Abstract** Since 2008 the valuation of derivatives has evolved so that OIS discounting rather than LIBOR discounting is used. Payoffs from interest rate derivatives usually depend on LIBOR. This means that the valuation of interest rate derivatives depends on the evolution of two different term structures. The spread between OIS and LIBOR rates is often assumed to be constant or deterministic. This paper explores how this assumption can be relaxed. It shows how well-established methods used to represent one-factor interest rate models in the form of a binomial or trinomial tree can be extended so that the OIS rate and a LIBOR rate are jointly modelled in a three-dimensional tree. The procedures are illustrated with the valuation of spread options and Bermudan swap options. The tree is constructed so that LIBOR swap rates are matched.

**Keywords** OIS · LIBOR · Interest rate trees · Multi-curve modelling

# **1 Introduction**

Before the 2008 credit crisis, the spread between a LIBOR rate and the corresponding OIS (overnight indexed swap) rate was typically around 10 basis points. During the crisis this spread rose dramatically. This led practitioners to review their derivatives valuation procedures. A result of this review was a switch from LIBOR discounting to OIS discounting.

Finance theory argues that derivatives can be correctly valued by estimating expected cash flows in a risk-neutral world and discounting them at the risk-free rate. The OIS rate is a better proxy for the risk-free rate than LIBOR.<sup>1</sup> Another argument (appealing to many practitioners) in favor of using the OIS rate for discounting is that the interest paid on cash collateral is usually the overnight interbank rate and OIS rates are longer term rates derived from these overnight rates. The use of OIS rates therefore reflects funding costs.

J. Hull (B) · A. White
Joseph L. Rotman School of Management, University of Toronto, Toronto, ON, Canada
e-mail: hull@rotman.utoronto.ca

A. White
e-mail: awhite@rotman.utoronto.ca

<sup>1</sup>See for example Hull and White [15].

Many interest rate derivatives provide payoffs dependent on LIBOR. When LIBOR discounting was used, only one rate needed to be modelled to value these derivatives. Now that OIS discounting is used, more than one rate has to be considered. The spread between OIS and LIBOR rates is often assumed to be constant or deterministic. This paper provides a way of relaxing this assumption. It describes a way in which LIBOR with a particular tenor and OIS can be modelled using a three-dimensional tree.<sup>2</sup> It is an extension of ideas in the many papers that have been written on how one-factor interest rate models can be represented in the form of a two-dimensional tree. These papers include Ho and Lee [9], Black, Derman, and Toy [3], Black and Karasinski [4], Kalotay, Williams, and Fabozzi [18], Hainaut and MacGilchrist [8], and Hull and White [11, 13, 14, 16].
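The one-factor tree constructions cited above share a standard building block: the branching probabilities at each node. The sketch below shows the usual normal-branching calculation for a mean-reverting state variable on a trinomial tree, under the common choice dx = σ√(3Δt); it is a generic illustration of that literature, not the three-dimensional procedure developed in this paper:

```python
def trinomial_probs(j, a, dt):
    """Branching probabilities at level j of a one-factor trinomial
    interest-rate tree with normal branching.  The state is x = j*dx with
    dx = sigma*sqrt(3*dt), and the expected change of x over one step is
    -a*x*dt (mean reversion at rate a); M = -a*j*dt is that drift measured
    in units of dx.  The probabilities match the first two moments of the
    continuous process."""
    M = -a * j * dt
    pu = 1.0 / 6.0 + (M * M + M) / 2.0  # up move (+dx)
    pm = 2.0 / 3.0 - M * M              # middle (no move)
    pd = 1.0 / 6.0 + (M * M - M) / 2.0  # down move (-dx)
    return pu, pm, pd
```

At the central node (j = 0) this gives the familiar (1/6, 2/3, 1/6); away from the centre the probabilities tilt so that the mean change equals the mean-reverting drift.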

The balance of the paper is organized as follows. We first describe how LIBOR-OIS spreads have evolved through time. Second, we describe how a three-dimensional tree can be constructed to model both OIS rates and the LIBOR-OIS spread for a particular tenor. We then illustrate the tree-building process using a simple three-step tree. We investigate the convergence of the three-dimensional tree by using it to calculate the value of options on the LIBOR-OIS spread. Finally, we value Bermudan swap options, showing that in a low-interest-rate environment the assumption that the spread is stochastic rather than deterministic can have a non-trivial effect on valuations.

# **2 The LIBOR-OIS Spread**

LIBOR quotes for maturities of one-, three-, six-, and 12-months in a variety of currencies are produced every day by the British Bankers' Association based on submissions from a panel of contributing banks. These are estimates of the unsecured rates at which AA-rated banks can borrow from other banks. The *T* -month OIS rate is the fixed rate paid on a *T* -month overnight interest rate swap. In such a swap the payment at the end of *T* -months is the difference between the fixed rate and a rate which is the geometric mean of daily overnight rates. The calculation of the payment on the floating side is designed to replicate the aggregate interest that would be earned from rolling over a sequence of daily loans at the overnight rate. (In U.S. dollars, the overnight rate used is the effective federal funds rate.) The LIBOR-OIS spread is the LIBOR rate less the corresponding OIS rate.
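The floating-leg mechanics described above can be sketched in a few lines: compounding the daily overnight rates replicates the aggregate interest from rolling over a sequence of daily loans. Daily compounding and an actual/360 day count are assumed purely for illustration; actual market conventions vary by currency:

```python
from math import prod

def ois_net_payment(overnight_rates, notional, fixed_rate):
    """Net payment at maturity to the floating-rate receiver of an
    overnight indexed swap: compounded daily overnight interest (which
    replicates rolling over daily loans at the overnight rate) minus the
    fixed-leg interest.  overnight_rates is one annualised rate per day;
    actual/360 day count assumed for illustration."""
    days = len(overnight_rates)
    # compound the daily overnight rates over the accrual period
    growth = prod(1 + r / 360 for r in overnight_rates)
    floating_interest = notional * (growth - 1)
    fixed_interest = notional * fixed_rate * days / 360
    return floating_interest - fixed_interest
```

With a constant overnight rate equal to the fixed rate, the floating leg comes out very slightly ahead, because daily compounding beats simple interest over the same period.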

<sup>2</sup>At the end of Hull and White [17] we described an attempt to do this using a two-dimensional tree. The current procedure is better. Our earlier procedure only provides an approximate answer because the correlation between spreads at adjacent tree nodes is not fully modelled.

LIBOR-OIS spreads were markedly different during the pre-crisis (December 2001–July 2007) and post-crisis (July 2009–April 2015) periods. This is illustrated in Fig. 1. In the pre-crisis period, the spread term structure was quite flat, with the 12-month spread only about 4 basis points higher than the one-month spread on average. As shown in Fig. 1a, the 12-month spread was sometimes higher and sometimes lower than the one-month spread. The average one-month spread was about 10 basis points during this period. Because the term structure of spreads was on average fairly flat and quite small, it was plausible for practitioners to assume the existence of a single LIBOR zero curve and use it as a proxy for the risk-free zero curve. During the post-crisis period there has been a marked term structure of spreads. As shown in Fig. 1b, it is almost always the case that the spread curve is upward sloping. The average one-month spread continues to be about 10 basis points, but the average 12-month spread is about 62 basis points.

There are two factors that explain the difference between LIBOR rates and OIS rates. The first of these may be institutional. If a regression model is used to extrapolate the spread curve for shorter maturities, we find the one-day spread in the post-crisis period is estimated to be about 5 basis points. This is consistent with the spread between one-day LIBOR and the effective fed funds rate. Since these are both rates that a bank would pay to borrow money for 24 h, they should be the same. The 5 basis point difference must be related to institutional practices that affect the two different markets.<sup>3</sup>

Given that institutional differences account for about 5 basis points of spread, the balance of the spread must be attributable to credit. OIS rates are based on a continually refreshed one-day rate whereas τ-maturity LIBOR is a continually refreshed τ-maturity rate.<sup>4</sup> The difference between τ-maturity LIBOR and τ-maturity OIS then reflects the degree to which the credit quality of the LIBOR borrower is expected to decline over τ years.<sup>5</sup> In the pre-crisis period the expected decline in the borrower's credit quality implied by the spreads was small, but during the post-crisis period it has been much larger.

The average hazard rate over the life of a LIBOR loan with maturity τ is approximately

$$\overline{\lambda} = \frac{L(\tau)}{1 - R}$$

where *L*(τ ) is the spread of LIBOR over the risk-free rate and *R* is the recovery rate in the event of default. Let *h* be the hazard rate for overnight loans to high quality financial institutions (those that can borrow at the effective fed funds rate). This will also be the average hazard rate associated with OIS rates.

<sup>3</sup>For a more detailed discussion of these issues see Hull and White [15].

<sup>4</sup>A continually refreshed τ-maturity rate is the rate realized when a loan is made to a party with a certain specified credit rating (usually assumed in this context to be AA) for time τ. At the end of the period a new τ-maturity loan is made to a possibly different party with the same specified credit rating. See Collin-Dufresne and Solnik [6].

<sup>5</sup>It is well established that for high quality borrowers the expected credit quality declines with the passage of time.

**Fig. 1 a** Excess of 12-month LIBOR-OIS spread over one-month LIBOR-OIS spread December 4, 2001–July 31, 2007 period (basis points). Data Source: Bloomberg. **b** Post-crisis LIBOR-OIS spread for different tenors (basis points). Data Source: Bloomberg

Define *L*∗(τ) as the spread of LIBOR over OIS for a maturity of τ and *O*(τ) as the spread of OIS over the risk-free rate for this maturity. Because *L*(τ) = *L*∗(τ) + *O*(τ),

$$\overline{\lambda} = \frac{L^\*(\tau) + O(\tau)}{1 - R} = h + \frac{L^\*(\tau)}{1 - R}$$

This shows that when we model OIS and LIBOR we are effectively modelling OIS and the difference between the LIBOR hazard rate and the OIS hazard rate.
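The decomposition above is easy to verify numerically. In the sketch below (our own illustration), the recovery rate and the OIS spread over the risk-free rate are assumed values, while the 62 basis point LIBOR-OIS spread is taken from the post-crisis averages quoted earlier:

```python
# Numerical sketch of the hazard-rate decomposition.
# R and O are assumptions for illustration, not the paper's calibrated numbers.
R = 0.4          # assumed recovery rate
O = 0.0005       # assumed spread of OIS over the risk-free rate
L_star = 0.0062  # 12-month LIBOR-OIS spread (62 bp, as in the post-crisis data)

h = O / (1 - R)           # average hazard rate implied by OIS rates
L = L_star + O            # LIBOR spread over the risk-free rate
lambda_bar = L / (1 - R)  # average hazard rate over the life of the LIBOR loan

# The two sides of the decomposition agree:
assert abs(lambda_bar - (h + L_star / (1 - R))) < 1e-12
```

With these inputs the average LIBOR hazard rate is about 1.12 % per year, of which all but *h* comes from the LIBOR-OIS spread.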

One of the results of the post-crisis spread term structure is that a single LIBOR zero curve no longer exists. LIBOR zero curves can be constructed from swap rates, but there is a different LIBOR zero curve for each tenor. This paper shows how OIS rates and a LIBOR rate with a particular tenor can be modelled jointly using a three-dimensional tree.<sup>6</sup>

# **3 The Methodology**

Suppose that we are interested in modelling OIS rates and the LIBOR rate with a tenor of τ. (Values of τ commonly used are one month, three months, six months, and 12 months.) Define *r* as the instantaneous OIS rate. We assume that some function of *r*, *x*(*r*), follows the process

$$\mathrm{d}x = \left[\theta(t) - a\_r x\right] \mathrm{d}t + \sigma\_r \,\mathrm{d}z\_r \tag{1}$$

This is an Ornstein–Uhlenbeck process with a time-dependent reversion level. The function θ(*t*) is chosen to match the initial term structure of OIS rates; *ar* (≥0) is the reversion rate of *x*; σ*r* (>0) is the volatility of *r*; and d*zr* is a Wiener process.<sup>7</sup>

Define *s* as the spread between the LIBOR rate with tenor τ and the OIS rate with tenor τ (both rates being measured with a compounding frequency corresponding to the tenor). We assume that some function of *s*, *y*(*s*), follows the process:

$$\mathrm{d}y = \left[\phi(t) - a\_s y\right] \mathrm{d}t + \sigma\_s \,\mathrm{d}z\_s \tag{2}$$

This is also an Ornstein–Uhlenbeck process with a time-dependent reversion level. The function φ(*t*) is chosen to ensure that all LIBOR FRAs and swaps that can be entered into today have a value of zero; *as* (≥0) is the reversion rate of *y*; σ*s* (>0) is

<sup>6</sup>Extending the approach so that more than one LIBOR rate is modelled is not likely to be feasible as it would involve using backward induction in conjunction with a four (or more)-dimensional tree. In practice, multiple LIBOR rates are most likely to be needed for portfolios when credit and other valuation adjustments are calculated. Monte Carlo simulation is usually used in these situations.

<sup>7</sup>This model does not allow interest rates to become negative. Negative interest rates have been observed in some currencies (particularly the euro and Swiss franc). If −*e* is the assumed minimum interest rate, this model can be adjusted so that *x* = ln(*r* + *e*). The choice of *e* is somewhat arbitrary, but changes the assumptions made about the behavior of interest rates in a non-trivial way.

the volatility of *s*; and d*zs* is a Wiener process. The correlation between d*zr* and d*zs* will be denoted by ρ.

We will use a three-dimensional tree to model *x* and *y*. A tree is a discrete-time, discrete-space approximation of a continuous stochastic process for a variable. The tree is constructed so that the mean and standard deviation of the variable are matched over each time step. Results in Ames [1] show that in the limit the tree converges to the continuous-time process. At each node of the tree, *r* and *s* can be calculated using the inverses of the functions *x* and *y*.

We will first outline a step-by-step approach to constructing the three-dimensional tree and then provide more details in the context of a numerical example in Sect. 4.<sup>8</sup> The steps in the construction of the tree are as follows:


$$\frac{1/P-1}{\tau}$$

3. Construct a trinomial tree for the process for the spread function, *y*, in Eq. (2) when the function φ(*t*) is set equal to zero and the initial value of *y* is set equal to

<sup>8</sup>Readers who have worked with interest rate trees will be able to follow our step-by-step approach. Other readers may prefer to follow the numerical example.

<sup>9</sup>See for example Brigo and Mercurio [5] or Hull [10].

<sup>10</sup>See for example Hull and White [14].

<sup>11</sup>The *r*-tree shows the evolution of the Δ*t*-maturity OIS rate. Since we are interested in modelling the τ-maturity LIBOR-OIS spread, it is necessary to determine the evolution of the τ-maturity OIS rate.

zero.<sup>12</sup> We will refer to this as the "preliminary tree". When interest rate trees are built, the expected value of the short rate at each time step is chosen so that the initial term structure is matched. The adjustment to the expected rate at time *t* is achieved by adding some constant, α*t*, to the value of *x* at each node at that step.<sup>13</sup> The expected value of the spread at each step of the spread tree that is eventually constructed will similarly be chosen to match forward LIBOR rates. The current preliminary tree is a first step toward the construction of the final spread tree.


$$\frac{F - (w + s)}{1 + w\tau}$$

The value of the FRA is calculated for all nodes at time *t* and the values are discounted back through the three-dimensional tree to find the present value.<sup>16</sup> As discussed in step 3, the expected spread (i.e., the amount by which nodes are shifted from their positions in the preliminary tree) is chosen so that this present value is zero.

<sup>12</sup>As in the case of the tree for the interest rate function, *x*, the method can be generalized to accommodate a variety of two-dimensional and three-dimensional tree-building procedures.

<sup>13</sup>This is equivalent to determining the time varying drift parameter, θ (*t*), that is consistent with the current term structure.

<sup>14</sup>A forward rate agreement (FRA) is one leg of a fixed for floating interest rate swap. Typically, the forward rates underlying some FRAs can be observed in the market. Others can be bootstrapped from the fixed rates exchanged in interest rate swaps.

<sup>15</sup>*F*, *w*, and *s* are expressed with a compounding period of τ .

<sup>16</sup>Calculations are simplified by calculating Arrow–Debreu prices, first at all nodes of the twodimensional OIS tree and then at all nodes of the three-dimensional tree. The latter can be calculated at the end of the fifth step as they do not depend on spread values. This is explained in more detail and illustrated numerically in Sect. 4.

# **4 A Simple Three-Step Example**

We now present a simple example to illustrate the implementation of our procedure. We assume that the LIBOR maturity of interest is 12 months (τ = 1). We assume that *x* = ln(*r*) with *x* following the process in Eq. (1). Similarly we assume that *y* = ln(*s*) with *y* following the process in Eq. (2). We assume that the initial OIS zero rates and 12-month LIBOR forward rates are those shown in Table 1. We will build a 1.5-year tree where the time step, Δ*t*, equals 0.5 years. We assume that the reversion rate and volatility parameters are as shown in Table 2.

As explained in Hull and White [11, 13] we first build a tree for *x* assuming that θ(*t*) = 0. We set the spacing of the *x* nodes, Δ*x*, equal to σ*r*√(3Δ*t*) = 0.3062. Define node (*i*, *j*) as the node at time *i*Δ*t* for which *x* = *j*Δ*x*. (The middle node at each time has *j* = 0.) The normal branching process in the tree is from (*i*, *j*) to one of (*i*+1, *j*+1), (*i*+1, *j*), and (*i*+1, *j*−1). The transition probabilities to these three nodes are *pu*, *pm*, and *pd* and are chosen to match the mean and standard deviation


**Table 1** Percentage interest rates for the examples

The OIS zero rates are expressed with continuous compounding while all forward and forward spread rates are expressed with annual compounding. The OIS zero rates and LIBOR forward rates are exact. OIS zero rates and LIBOR forward rates for maturities other than those given are determined using linear interpolation. The rates in the final two columns are rounded values calculated from the given OIS zero rates and LIBOR forward rates


of changes in time Δ*t*:<sup>17</sup>

$$\begin{aligned} p\_u &= \frac{1}{6} + \frac{1}{2} (a\_r^2 j^2 \Delta t^2 - a\_r j \Delta t) \\ p\_m &= \frac{2}{3} - a\_r^2 j^2 \Delta t^2 \\ p\_d &= \frac{1}{6} + \frac{1}{2} (a\_r^2 j^2 \Delta t^2 + a\_r j \Delta t) \end{aligned}$$

As soon as *j* > 0.184/(*ar*Δ*t*), the branching process is changed so that (*i*, *j*) leads to one of (*i* + 1, *j*), (*i* + 1, *j* − 1), and (*i* + 1, *j* − 2). The transition probabilities to these three nodes are

$$\begin{aligned} p\_u &= \frac{7}{6} + \frac{1}{2} (a\_r^2 j^2 \Delta t^2 - 3a\_r j \Delta t) \\ p\_m &= -\frac{1}{3} - a\_r^2 j^2 \Delta t^2 + 2a\_r j \Delta t \\ p\_d &= \frac{1}{6} + \frac{1}{2} (a\_r^2 j^2 \Delta t^2 - a\_r j \Delta t) \end{aligned}$$

Similarly, as soon as *j* < −0.184/(*ar*Δ*t*) the branching process is changed so that (*i*, *j*) leads to one of (*i* + 1, *j* + 2), (*i* + 1, *j* + 1), and (*i* + 1, *j*). The transition probabilities to these three nodes are

$$\begin{aligned} p\_u &= \frac{1}{6} + \frac{1}{2} (a\_r^2 j^2 \Delta t^2 + a\_r j \Delta t) \\ p\_m &= -\frac{1}{3} - a\_r^2 j^2 \Delta t^2 - 2a\_r j \Delta t \\ p\_d &= \frac{7}{6} + \frac{1}{2} (a\_r^2 j^2 \Delta t^2 + 3a\_r j \Delta t) \end{aligned}$$
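The three branching regimes can be collected into a single routine. The sketch below is our own illustration, not code from the paper; `j_max` encodes the level at which the branching switches, the smallest integer exceeding 0.184/(*ar*Δ*t*):

```python
def branch_probs(j, a, dt, j_max):
    """Trinomial branching for a mean-reverting variable on the tree.

    Returns (targets, (pu, pm, pd)): the three node levels reached at the
    next step and the corresponding transition probabilities.  Branching
    switches from the normal pattern once |j| reaches j_max.
    """
    M = a * j * dt                    # minus the expected change in j per step
    if j >= j_max:                    # downward branching: to j, j-1, j-2
        pu = 7/6 + (M * M - 3 * M) / 2
        pm = -1/3 - M * M + 2 * M
        pd = 1/6 + (M * M - M) / 2
        targets = (j, j - 1, j - 2)
    elif j <= -j_max:                 # upward branching: to j+2, j+1, j
        pu = 1/6 + (M * M + M) / 2
        pm = -1/3 - M * M - 2 * M
        pd = 7/6 + (M * M + 3 * M) / 2
        targets = (j + 2, j + 1, j)
    else:                             # normal branching: to j+1, j, j-1
        pu = 1/6 + (M * M - M) / 2
        pm = 2/3 - M * M
        pd = 1/6 + (M * M + M) / 2
        targets = (j + 1, j, j - 1)
    return targets, (pu, pm, pd)
```

In every regime the probabilities sum to one, the mean change in *j* is −*ar* *j*Δ*t*, and the variance is 1/3 in units of Δ*x*², matching the moments of the process in Eq. (1).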

We then use an iterative procedure to calculate in succession the amounts, α0, αΔ*t*, α2Δ*t*, …, by which the *x*-nodes at each time step must be shifted so that the OIS term structure is matched. The first value, α0, is chosen so that the tree correctly prices a discount bond maturing at time Δ*t*. The second value, αΔ*t*, is chosen so that the tree correctly prices a discount bond maturing at time 2Δ*t*, and so on.

Arrow–Debreu prices facilitate the calculation. The Arrow–Debreu price for a node is the price of a security that pays off \$1 if the node is reached and zero otherwise. Define *Ai*,*j* as the Arrow–Debreu price for node (*i*, *j*) and define *ri*,*j* as the Δ*t*-maturity interest rate at node (*i*, *j*). The value of α*i*Δ*t* can be calculated using an iterative search procedure from the *Ai*,*j* and the price at time zero, *Pi*+1, of a bond maturing at time (*i* + 1)Δ*t* using

<sup>17</sup>See for example Hull ([10], p. 725).

$$P\_{i+1} = \sum\_{j} A\_{i,j} \exp(-r\_{i,j} \Delta t) \tag{3}$$

in conjunction with

$$r\_{i,j} = \exp(\alpha\_{i\Delta t} + j\Delta x) \tag{4}$$

where the summation in Eq. (3) is over all *j* at time *i*Δ*t*. The Arrow–Debreu prices can then be updated using

$$A\_{i+1,k} = \sum\_{j} A\_{i,j} p\_{j,k} \exp(-r\_{i,j} \Delta t) \tag{5}$$

where *pj*,*k* is the probability of branching from (*i*, *j*) to (*i* + 1, *k*), and the summation is over all *j* at time *i*Δ*t*. The Arrow–Debreu price at the base of the tree, *A*0,0, is one. From this α0 can be calculated using Eqs. (3) and (4). The *A*1,*k* can then be calculated using Eqs. (4) and (5). After that αΔ*t* can be calculated using Eqs. (3) and (4), and so on.
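The forward-induction loop of Eqs. (3)–(5) can be sketched as follows, assuming *x* = ln(*r*) as in the example of Sect. 4. This is our own illustration: the `tree` data structure (a list of maps from node level to branches) and the use of bisection for the iterative search are assumptions, not the paper's implementation:

```python
import math

def fit_alphas(bond_prices, tree, dt, dx):
    """Fit the shifts alpha by forward induction (Eqs. (3)-(5)) so that the
    x-tree reprices the OIS discount bonds P_1, P_2, ...

    tree[i] maps a node level j to a list of (k, prob) branches at step i.
    Returns the alphas and the Arrow-Debreu prices A[i][j].
    """
    A = [{0: 1.0}]                       # A_{0,0} = 1 at the base of the tree
    alphas = []
    for i, P in enumerate(bond_prices):
        # Solve Eq. (3): P = sum_j A_{i,j} exp(-r_{i,j} dt), r = exp(alpha + j dx)
        def price(alpha):
            return sum(a * math.exp(-math.exp(alpha + j * dx) * dt)
                       for j, a in A[i].items())
        lo, hi = -20.0, 5.0              # bisection bracket for alpha
        for _ in range(100):
            mid = (lo + hi) / 2
            if price(mid) > P:           # price decreases as alpha increases
                lo = mid
            else:
                hi = mid
        alpha = (lo + hi) / 2
        alphas.append(alpha)
        # Update the Arrow-Debreu prices, Eq. (5)
        nxt = {}
        for j, a in A[i].items():
            disc = math.exp(-math.exp(alpha + j * dx) * dt)
            for k, p in tree[i][j]:
                nxt[k] = nxt.get(k, 0.0) + a * p * disc
        A.append(nxt)
    return alphas, A
```

Because the first step has a single node, the first fitted shift satisfies exp(α0) = the Δ*t*-maturity OIS rate implied by *P*1, as expected.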

It is then necessary to calculate the value of the 12-month OIS rate at each node (step 2 in the previous section). As the tree has six-month time steps, a two-period roll-back is required in the case of our simple example, so a four-step tree must be built. The value at the *j*th node at time 4Δ*t* (= 2) of a discount bond that pays \$1 at time 5Δ*t* (= 2.5) is exp(−*r*4,*j*Δ*t*).

Discounting these values back to time 3Δ*t* (= 1.5) gives the price of a one-year discount bond at each node at 3Δ*t*, from which the bond's yield can be determined. This is repeated for a bond that pays \$1 at time 4Δ*t*, resulting in the one-year yields at time 2Δ*t*, and so on. The tree constructed so far and the values calculated are shown in Fig. 2.<sup>18</sup>
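The roll-back that produces the one-year OIS rate at each node can be sketched as follows (our own illustrative code; the data structures are hypothetical, and the yield conversion uses the (1/*P* − 1)/τ formula of Sect. 3):

```python
import math

def rollback_bond_yields(r, tree, i, dt, periods):
    """Roll a $1 discount bond back from step i + periods to step i and
    convert the resulting bond prices into tau-maturity OIS rates.

    r[step][j] : the Delta-t OIS rate at node (step, j)
    tree[step] : node j -> list of (k, prob) branches to step + 1
    """
    value = {j: 1.0 for j in r[i + periods]}     # bond pays $1 at maturity
    for step in range(i + periods - 1, i - 1, -1):
        value = {j: math.exp(-r[step][j] * dt) *
                    sum(p * value[k] for k, p in tree[step][j])
                 for j in r[step]}
    tau = periods * dt                           # bond maturity, e.g. 1 year
    return {j: (1.0 / value[j] - 1.0) / tau for j in value}
```

With a flat Δ*t*-rate of 3 % and two half-year periods, each node's one-year bond price is exp(−0.03) and the implied one-year OIS rate is e^0.03 − 1.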

The next stage (step 3 in the previous section) is to construct a tree for the spread assuming that the expected future spread is zero (the preliminary tree). As in the case of the OIS tree, Δ*t* = 0.5 and Δ*y* = σ*s*√(3Δ*t*) = 0.2449. The branching process and probabilities are calculated as for the OIS tree (with *ar* replaced by *as*).

A three-dimensional tree is then created (step 4 in the previous section) by combining the spread tree and the OIS tree assuming zero correlation. We denote the node at time *i*Δ*t* where *x* = *j*Δ*x* and *y* = *k*Δ*y* by node (*i*, *j*, *k*). Consider for example node (2, −2, 2). This corresponds to node (2, −2) in the OIS tree, node *I* in Fig. 2, and node (2, 2) in the spread tree. The probabilities for the OIS tree are *pu* = 0.0809, *pm* = 0.0583, *pd* = 0.8609 and the branching process is to nodes where *j* = 0, *j* = −1, and *j* = −2. The probabilities for the spread tree are *pu* = 0.1217, *pm* = 0.6567, *pd* = 0.2217 and the branching process is to nodes where *k* = 1, *k* = 2, and *k* = 3. Denote *puu* as the probability of the highest move in the OIS tree being combined with the highest move in the spread tree; *pum* as the probability of the highest move in the OIS tree being combined with the middle move in the spread tree; and so on. The probability, *puu* of moving from node (2, −2, 2) to

<sup>18</sup>More details on the construction of the tree can be found in Hull [10].


**Fig. 2** Tree for OIS rates in three-step example

node (3, 0, 3) is therefore 0.0809 × 0.1217 or 0.0098; the probability, *pum*, of moving from node (2, −2, 2) to node (3, 0, 2) is 0.0809 × 0.6567 or 0.0531; and so on. These (unadjusted) branching probabilities at node (2, −2, 2) are shown in Table 4a.
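With zero correlation the three-dimensional branching probabilities are simply the products of the two marginal trinomial probabilities. Using the probabilities quoted above for node (2, −2, 2):

```python
def combined_probs(probs_r, probs_s):
    """Zero-correlation branch probabilities in the three-dimensional tree:
    the product of the marginal OIS-tree and spread-tree probabilities."""
    return {(mr, ms): pr * ps
            for mr, pr in zip("umd", probs_r)
            for ms, ps in zip("umd", probs_s)}

# Node (2, -2, 2) of the example (marginal probabilities from the text)
p_r = (0.0809, 0.0583, 0.8609)   # OIS tree at node (2, -2)
p_s = (0.1217, 0.6567, 0.2217)   # spread tree at node (2, 2)
p = combined_probs(p_r, p_s)
print(round(p[("u", "u")], 4))   # p_uu = 0.0809 * 0.1217
print(round(p[("u", "m")], 4))   # p_um = 0.0809 * 0.6567
```

The two values printed are 0.0098 and 0.0531, matching the text; the nine products sum to one up to the rounding already present in the quoted marginals.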

The next stage (step 5 in the previous section) is to adjust the probabilities to build in correlation between the OIS rate and the spread (i.e., the correlation between d*zr* and d*zs*). As explained in Hull and White [12], the probabilities are changed as indicated in Table 3.<sup>19</sup> This leaves the marginal distributions unchanged. The resulting adjusted probabilities at node (2, −2, 2) are shown in Table 4b. In the example we are currently considering the adjusted probabilities are never negative. In practice negative probabilities do occur, but they disappear as Δ*t* tends to zero. They tend to occur only on the edges of the tree where the non-standard branching process is used and do not interfere with convergence. Our approach when negative probabilities are encountered at a node is to change the correlation at that node to the greatest (positive or negative) correlation that is consistent with non-negative probabilities.

<sup>19</sup>The procedure described in Hull and White [12] applies to trinomial trees. For binomial trees the analogous procedure is to increase *puu* and *pdd* by ε while decreasing *pud* and *pdu* by ε where ε = ρ/4.


**Table 3** Adjustments to probabilities to reflect correlation in a three-dimensional trinomial tree

(*e* = ρ/36 where ρ is the correlation)
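Table 3 itself is not reproduced in this version of the text, so the sketch below uses one adjustment pattern that is consistent with *e* = ρ/36: the rows and columns of the adjustment matrix sum to zero (preserving the marginal distributions) and the corner terms add a covariance of ρσ*r*σ*s*Δ*t*. The exact sign pattern should be treated as our assumption rather than a reproduction of Table 3:

```python
# Adjustment matrix M, indexed by (r-move, s-move) over the u/m/d branches.
# Each product probability is changed by e * M[move], with e = rho / 36.
# NOTE: this sign pattern is our assumption, chosen so that marginals are
# preserved and the added covariance equals rho * sigma_r * sigma_s * dt.
M = {("u", "u"):  5, ("u", "m"): -4, ("u", "d"): -1,
     ("m", "u"): -4, ("m", "m"):  8, ("m", "d"): -4,
     ("d", "u"): -1, ("d", "m"): -4, ("d", "d"):  5}

def adjust_for_correlation(p, rho):
    """p maps (r-move, s-move) -> unadjusted product probability."""
    e = rho / 36.0
    return {moves: prob + e * M[moves] for moves, prob in p.items()}
```

For ρ = 0.05 this gives *e* ≈ 0.00139, as in the caption of Table 4b.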

**Table 4** (a) The unadjusted branching probabilities at node (2, −2, 2). The probabilities on the edge of the table are the branching probabilities at node (2, −2) of the *r*-tree and (2, 2) of the *s*-tree. (b) The adjusted branching probabilities at node (2, −2, 2). The probabilities on the edge of the table are the branching probabilities at node (2, −2) of the *r*-tree and (2, 2) of the *s*-tree. The adjustment is based on a correlation of 0.05 so *e* = 0.00139


The tree constructed so far reflects actual OIS movements and artificial spread movements where the initial spread and expected future spread are zero. We are now in a position to calculate Arrow–Debreu prices for each node of the three-dimensional tree. These Arrow–Debreu prices remain the same when the positions of the spread nodes are changed because the Arrow–Debreu price for a node depends only on OIS rates and the probability of the node being reached. They are shown in Table 5.

The final stage involves shifting the position of the spread nodes so that the prices of all LIBOR FRAs with a fixed rate equal to the initial forward LIBOR rate are zero. An iterative procedure is used to calculate the adjustment to the values of *y*


**Table 5** Arrow–Debreu prices for simple three-step example

at each node at each time step, β0, βΔ*t*, β2Δ*t*, …, so that the FRAs have a value of zero. Given that Arrow–Debreu prices have already been calculated this is a fairly straightforward search. When the α*j*Δ*t* are determined it is necessary to first consider *j* = 0, then *j* = 1, then *j* = 2, and so on, because the α-value at a particular time depends on the α-values at earlier times. The β-values, however, are independent of each other and can be determined in any order, or as needed. In the case of our example, β0 = −6.493, βΔ*t* = −6.459, β2Δ*t* = −6.426, β3Δ*t* = −6.395.

# **5 Valuation of a Spread Option**

To illustrate convergence, we use the tree to calculate the value of a European call option that pays off 100 times max(*s* − 0.002, 0) at time *T* where *s* is the spread. First, we let *T* = 1.5 years and use the three-step tree developed in the previous section. At the third step of the tree we calculate the spread at each node. The spread at node (3, *j*, *k*) is exp[φ(3Δ*t*) + *k*Δ*y*]. These values are shown in the second line of Table 6. Once the spread values have been determined the option payoffs, 100 times max(*s* − 0.002, 0), at each node are calculated. These values are shown in the rest of Table 6. The option value is found by multiplying each option payoff by the corresponding Arrow–Debreu price in Table 5 and summing the values. The resulting option value is 0.00670. Table 7 shows how, for a 1.5- and 5-year spread option, the value converges as the number of time steps per year is increased.
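The final valuation step is just a weighted sum of payoffs and Arrow–Debreu prices. In the sketch below the node keys and numerical values are hypothetical stand-ins, not the entries of Tables 5 and 6:

```python
def value_from_ad_prices(ad, payoff):
    """European option value: Arrow-Debreu price times payoff, summed
    over the terminal nodes of the three-dimensional tree."""
    return sum(ad[node] * payoff[node] for node in payoff)

# Hypothetical two-node illustration (values are NOT from Tables 5 and 6)
ad = {(3, 0, 1): 0.20, (3, 0, -1): 0.25}
payoff = {(3, 0, 1): 0.05, (3, 0, -1): 0.0}
print(round(value_from_ad_prices(ad, payoff), 6))
```

Here only the first node pays off, so the printed value is 0.2 × 0.05 = 0.01.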


**Table 6** Spread and spread option payoff at time 1.5 years when spread option is evaluated using a three-step tree

**Table 7** Value of a European spread option paying off 100 times the greater of the spread less 0.002 and zero


The market data used to build the tree are given in Tables 1 and 2

**Table 8** Value of a five-year European spread option paying off 100 times the greater of the spread less 0.002 and zero


The market data used to build the tree are given in Tables 1 and 2 except that the volatility of the spread and the correlation between the spread and the OIS rate are as given in this table. The number of time steps is 32 per year

Table 8 shows how the spread option price is affected by the assumed correlation and the volatility of the spread. All of the input parameters are as given in Tables 1 and 2 except that correlations between −0.75 and 0.75, and spread volatilities between 0.05 and 0.25, are considered. As might be expected, the spread option price is very sensitive to the spread volatility. However, it is not very sensitive to the correlation. The reason for this is that changing the correlation primarily affects the Arrow–Debreu prices and leaves the option payoffs almost unchanged. Increasing the correlation increases the Arrow–Debreu prices on one diagonal of the final nodes and decreases them on the other diagonal. For example, in the three-step tree used to evaluate the option, the Arrow–Debreu prices for nodes (3, 2, 3) and (3, −2, −3) increase while those for nodes (3, −2, 3) and (3, 2, −3) decrease. Since the option payoffs at nodes (3, 2, 3) and (3, −2, 3) are the same, the changes in the Arrow–Debreu prices offset one another, resulting in only a small correlation effect.

# **6 Bermudan Swap Option**

We now consider how the valuation of a Bermudan swap option is affected by a stochastic spread in a low-interest-rate environment such as that experienced in the years following 2009. Bermudan swap options are popular instruments where the holder has the right to enter into a particular swap on a number of different swap payment dates.

The valuation procedure involves rolling back through the tree calculating both the swap price and (where appropriate) the option price. The swap's value is set equal to zero at the nodes on the swap's maturity date. The value at earlier nodes is calculated by rolling back, adding in the present value of the next payment at each reset date. The option's value is set equal to max(*S*, 0) where *S* is the swap value at the option's maturity. It is then set equal to max(*S*, *V*) for nodes on exercise dates, where *S* is the swap value and *V* is the value of the option given by the roll-back procedure.
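The roll-back logic can be sketched on an abstract tree as follows. This is our own illustration: it assumes the swap values at each node have already been obtained from the roll-back of the swap payments, and the data structures are hypothetical:

```python
import math

def bermudan_rollback(swap_value, r, tree, dt, exercise_steps, n_steps):
    """Backward induction for a Bermudan option on a swap.

    swap_value[i][node] : value of the underlying swap at node (i, node)
    r[i][node]          : Delta-t OIS discount rate at node (i, node)
    tree[i][node]       : list of (next_node, prob) branches
    exercise_steps      : set of steps at which early exercise is allowed
    """
    # At the option's maturity the option is worth max(S, 0)
    V = {node: max(S, 0.0) for node, S in swap_value[n_steps].items()}
    for i in range(n_steps - 1, -1, -1):
        cont = {node: math.exp(-r[i][node] * dt) *
                      sum(p * V[k] for k, p in tree[i][node])
                for node in swap_value[i]}
        # On exercise dates the holder takes the better of S and V
        V = ({node: max(swap_value[i][node], cont[node]) for node in cont}
             if i in exercise_steps else cont)
    return V
```

Passing an empty `exercise_steps` set reduces the routine to a European valuation, which is a convenient sanity check.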

We assume an OIS term structure that increases linearly from 15 basis points at time zero to 250 basis points at time 10 years. The OIS zero rate for maturity *t* is therefore

$$0.0015 + \frac{0.0235t}{10}$$

The process followed by the instantaneous OIS rate is similar to that derived by Deguillaume, Rebonato and Pogodin [7], and Hull and White [16]. For short rates between 0 and 1.5 %, changes in the rate are assumed to be lognormal with a volatility of 100 %. Between 1.5 % and 6 %, changes in the short rate are assumed to be normal with the standard deviation of rate moves in time Δ*t* being 0.015√Δ*t*. Above 6 %, rate moves are assumed to be lognormal with a volatility of 25 %. This pattern of the short rate's variability is shown in Fig. 3.
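Note that this specification is continuous at both breakpoints: 1.00 × 1.5 % = 0.25 × 6 % = 1.5 %, so the absolute standard deviation is 0.015 per unit time throughout the middle range. A sketch of the resulting standard deviation of the rate move over Δ*t*:

```python
import math

def short_rate_std(r, dt):
    """Standard deviation of the short-rate move over dt under the
    piecewise lognormal / normal / lognormal specification in the text."""
    if r < 0.015:
        sd = 1.00 * r      # lognormal with 100% volatility
    elif r <= 0.06:
        sd = 0.015         # normal, absolute standard deviation 0.015
    else:
        sd = 0.25 * r      # lognormal with 25% volatility
    return sd * math.sqrt(dt)

# Continuity at the two breakpoints
assert abs(short_rate_std(0.015, 1.0) - 0.015) < 1e-12
assert abs(short_rate_std(0.06, 1.0) - 0.25 * 0.06) < 1e-12
```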

The spread between the forward 12-month OIS rate and the forward 12-month LIBOR rate is assumed to be 50 basis points for all maturities. The process assumed for the 12-month LIBOR-OIS spread, *s*, is that used in the example in Sects. 4 and 5:

$$\mathrm{d}\ln(s) = \left[\phi(t) - a\_s \ln(s)\right] \mathrm{d}t + \sigma\_s \,\mathrm{d}z\_s$$

**Fig. 3** Variability assumed for the short OIS rate, *r*, in the Bermudan swap option valuation. The standard deviation of the short rate in time Δ*t* is *s*(*r*)√Δ*t*

**Table 9** (a) Value in a low-interest rate environment, of a receive-fixed Bermudan swap option on a 5-year annual-pay swap where the notional principal is 100 and the option can be exercised at times 1, 2, and 3 years. The swap rate is 1.5%. (b) Value in a low-interest-rate environment of a received-fixed Bermudan swap option on a 10-year annual-pay swap where the notional principal is 100 and the option can be exercised at times 1, 2, 3, 4, and 5 years. The swap rate is 3.0%


A maximum likelihood analysis of data on the 12-month LIBOR-OIS spread over the 2012 to 2014 period indicates that the behavior of the spread can be approximately described by a high volatility in conjunction with a high reversion rate. We set *as* equal to 0.4 and considered values of σ*s* equal to 0.30, 0.50, and 0.70. A number of alternative correlations between the spread process and the OIS process were also considered. We find that a correlation of about −0.1 between the one-month OIS rate and the 12-month LIBOR-OIS spread is indicated by the data.<sup>20</sup>

We consider two cases:

1. A receive-fixed Bermudan option on a five-year annual-pay swap with a swap rate of 1.5 % that can be exercised at times 1, 2, and 3 years (the 3 × 5 swap option of Table 9a).
2. A receive-fixed Bermudan option on a 10-year annual-pay swap with a swap rate of 3.0 % that can be exercised at times 1, 2, 3, 4, and 5 years (the 5 × 10 swap option of Table 9b).
Table 9a shows results for the 3 × 5 swap option. In this case, even when the correlation between the spread and the OIS rate is relatively small, a stochastic spread is liable to change the price by 5–10%. Table 9b shows results for the 5 × 10 swap option. In this case, the percentage impact of a stochastic spread is smaller. This is because the spread, as a proportion of the average of the relevant forward OIS rates, is lower. The results in both tables are based on 32 time steps per year. As the level of OIS rates increases, the impact of a stochastic spread becomes smaller in both Tables 9a and 9b.

Comparing Tables 8 and 9, we see that the correlation between the OIS rate and the spread has a much bigger effect on the valuation of a Bermudan swap option than on the valuation of a spread option. For a spread option we argued that option payoffs for high Arrow–Debreu prices tend to offset those for low Arrow–Debreu prices. This is not the case for a Bermudan swap option because the payoff depends on the LIBOR rate, which depends on the OIS rate as well as the spread.

# **7 Conclusions**

For investment grade companies it is well known that the hazard rate is an increasing function of time. This means that the credit spread applicable to borrowing by AA-rated banks from other banks is an increasing function of maturity. Since 2008, markets have recognized this with the result that the LIBOR-OIS spread has been an increasing function of tenor.

Since 2008, practitioners have also switched from LIBOR discounting to OIS discounting. This means that two zero curves have to be modelled when most interest rate derivatives are valued. Many practitioners assume that the relevant LIBOR-OIS spread is either constant or deterministic. Our research shows that this is liable to lead to inaccurate pricing, particularly in the current low interest rate environment.

The tree approach we have presented provides an alternative to Monte Carlo simulation for simultaneously modelling spreads and OIS rates. It can be regarded as

<sup>20</sup>Because of the way LIBOR is calculated, daily LIBOR changes can be less volatile than the corresponding daily OIS changes (particularly if the Fed is not targeting a particular overnight rate). In some circumstances, it may be appropriate to consider changes over periods longer than one day when estimating the correlation.

an extension of the explicit finite difference method and is particularly useful when American-style derivatives are valued. It avoids the need to use techniques such as those suggested by Longstaff and Schwartz [19] and Andersen (2000) for handling early exercise within a Monte Carlo simulation.

Implying all the model parameters from market data is not likely to be feasible. One reasonable approach is to use historical data to determine the spread process and its correlation with the OIS process so that only the parameters driving the OIS process are implied from the market. The model can then be used in the same way that two-dimensional tree models for LIBOR were used pre-crisis.

**Acknowledgements** We are grateful to the Global Risk Institute in Financial Services for funding this research.

The KPMG Center of Excellence in Risk Management is acknowledged for organizing the conference "Challenges in Derivatives Markets - Fixed Income Modeling, Valuation Adjustments, Risk Management, and Regulation".

**Open Access** This chapter is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, a link is provided to the Creative Commons license and any changes made are indicated.

The images or other third party material in this chapter are included in the work's Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work's Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.

# **References**


# **Derivative Pricing for a Multi-curve Extension of the Gaussian, Exponentially Quadratic Short Rate Model**

**Zorana Grbac, Laura Meneghello and Wolfgang J. Runggaldier**

**Abstract** The recent financial crisis has led to so-called multi-curve models for the term structure. Here we study a multi-curve extension of short rate models where, in addition to the short rate itself, we introduce short rate spreads. In particular, we consider a Gaussian factor model where the short rate and the spreads are second order polynomials of Gaussian factor processes. This leads to an exponentially quadratic model class that is less well known than the exponentially affine class. In the latter class the factors enter linearly and for positivity one considers square root factor processes. While the square root factors in the affine class have more involved distributions, in the quadratic class the factors remain Gaussian and this leads to various advantages, in particular for derivative pricing. After some preliminaries on martingale modeling in the multi-curve setup, we concentrate on pricing of linear and optional derivatives. For linear derivatives, we exhibit an adjustment factor that allows one to pass from pre-crisis single curve values to the corresponding post-crisis multi-curve values.

**Keywords** Multi-curve models · Short rate models · Short rate spreads · Gaussian exponentially quadratic models · Pricing of linear and optional interest rate derivatives · Riccati equations · Adjustment factors

Z. Grbac (B)

Laboratoire de Probabilités et Modèles Aléatoires, Université Paris Diderot – Paris 7, Case 7012, 75205 Paris Cedex 13, France e-mail: grbac@math.univ-paris-diderot.fr

L. Meneghello · W.J. Runggaldier

Dipartimento di Matematica Pura ed Applicata, Università di Padova, Via Trieste 63, 35121 Padova, Italy e-mail: meneghello.laura@yahoo.com

W.J. Runggaldier e-mail: runggal@math.unipd.it

L. Meneghello Present affiliation: Gruppo Banco Popolare *Disclaimer: The views, thoughts and opinions expressed in this paper are those of the authors in their individual capacity and should not be attributed to Gruppo Banco Popolare or to the authors as representatives or employees of Gruppo Banco Popolare*.

# **1 Introduction**

The recent financial crisis has heavily impacted the financial market and the fixed income markets in particular. Key features put forward by the crisis are counterparty and liquidity/funding risk. In interest rate derivatives the underlying rates are typically Libor/Euribor. These are determined by a panel of banks and thus reflect various risks in the interbank market, in particular counterparty and liquidity risk. The standard no-arbitrage relations between Libor rates of different maturities have broken down and significant spreads have been observed between Libor rates of different tenors, as well as between Libor and OIS swap rates, where OIS stands for Overnight Indexed Swap. For more details on this issue see Eqs. (5)–(7) and the paragraph following them, as well as the paper by Bormetti et al. [1] and a corresponding version in this volume. This has led practitioners and academics alike to construct multi-curve models in which future cash flows are generated through curves associated to the underlying rates (typically the Libor, one for each tenor structure), but are discounted by another curve.

For the pre-crisis single-curve setup various interest rate models have been proposed. Some of the standard model classes are: the short rate models; the instantaneous forward rate models in a Heath–Jarrow–Morton (HJM) setup; the market forward rate models (Libor market models). In this paper we consider a possible multi-curve extension of the short rate model class that, with respect to the other model classes, has in particular the advantage of leading more easily to a Markovian structure. Other multi-curve extensions of short rate models have appeared in the literature, such as Kijima et al. [22], Kenyon [20], Filipović and Trolle [14], and Morino and Runggaldier [27]. The present paper considers an exponentially quadratic model, whereas the models in the mentioned papers concern mainly the exponentially affine framework, except for [22], in which exponentially quadratic models are mentioned. More details on the difference between the exponentially affine and exponentially quadratic short rate models will be provided below.

Inspired by a credit risk analogy, but also by the common practice of deriving multi-curve quantities by adding a spread over the corresponding single-curve risk-free quantities, we shall consider, next to the short rate itself, a short rate spread to be added to the short rate, one for each possible tenor structure. Notice that these spreads are added from the outset.

To discuss the basic ideas in as simple a way as possible, we consider just a two-curve model, namely with one curve for discounting and one for generating future cash flows; in other words, we shall consider a single tenor structure. We shall thus concentrate on the short rate $r\_t$ and a single short rate spread $s\_t$ and, for their dynamics, introduce a factor model. In the pre-crisis single-curve setting there are two basic factor model classes for the short rate: the exponentially affine and the exponentially quadratic model classes. Here we shall concentrate on the less common quadratic class with Gaussian factors. In the exponentially affine class, where, to guarantee positivity of rates and spreads, one generally considers square root models for the factors, the distribution of the factors is a noncentral $\chi^2$. In the exponentially quadratic class the factors have a more convenient Gaussian distribution.

The paper is structured as follows. In the preliminary Sect. 2 we mainly discuss issues related to martingale modeling. In Sect. 3 we introduce the multi-curve Gaussian, exponentially quadratic model class. In Sect. 4 we deal with pricing of linear interest rate derivatives and, finally, in Sect. 5 with nonlinear/optional interest rate derivatives.

# **2 Preliminaries**

# *2.1 Discount Curve and Collateralization*

In the presence of multiple curves, the choice of the curve for discounting future cash flows, and the related choice of the numeraire for the standard martingale measure used for pricing (in other words, the question of absence of arbitrage), becomes nontrivial (see e.g. the discussion in Kijima and Muromachi [21]). To avoid issues of arbitrage, one should possibly have a common discount curve to be applied to all future cash flows independently of the tenor. A choice that has been widely accepted and has become practically standard is the OIS curve $T \mapsto p(t,T) = p^{OIS}(t,T)$, which can be stripped from OIS rates, namely the fair rates in an OIS. The arguments justifying this choice, typically evoked in practice, are that the majority of traded interest rate derivatives are nowadays collateralized and that the rate used for remuneration of the collateral is exactly the overnight rate, which is the rate the OIS are based on. Moreover, the overnight rate bears very little risk due to its short maturity and can therefore be considered relatively risk-free. In this context we also point out that prices corresponding to fully collateralized transactions are considered as clean prices (this terminology was first introduced by Crépey [6] and Crépey et al. [9]). Since collateralization is by now applied in the majority of cases, one may thus ignore counterparty and liquidity risk between individual parties when pricing interest rate derivatives, but cannot ignore the counterparty and liquidity risk in the interbank market as a whole. These risks are often jointly referred to as interbank risk, and they are the main drivers of the multi-curve phenomenon, as documented in the literature (see e.g. Crépey and Douady [7], Filipović and Trolle [14], and Gallitschke et al. [15]). We shall thus consider only clean valuation formulas, which take into account the multi-curve issue.
Possible ways to account for counterparty risk and funding issues between individual counterparties in a contract are, among others, to follow a global valuation approach that leads to nonlinear derivative valuation (see Brigo et al. [3, 4] and other references therein, and in particular Pallavicini and Brigo [28] for a global valuation approach applied specifically to interest rate modeling), or to consider various valuation adjustments that are generally computed on top of the clean prices (see Crépey [6]). A fully nonlinear valuation is preferable, but is more difficult to achieve. On the other hand, valuation adjustments are more consolidated and also used in practice and this gives a further justification to still look for clean prices. Concerning the explicit role of collateral in the pricing of interest rate derivatives, we refer to the above-mentioned paper by Pallavicini and Brigo [28].

# *2.2 Martingale Measures*

The fundamental theorem of asset pricing links the economic principle of absence of arbitrage with the notion of a martingale measure. As is well known, this is a measure under which the traded asset prices, expressed in units of a same numeraire, are local martingales. Models for interest rate markets are typically incomplete, so that absence of arbitrage allows for many martingale measures. A common approach in interest rate modeling is to perform martingale modeling, namely to model the quantities of interest directly under a generic martingale measure; one then has to perform a calibration in order to single out the specific martingale measure of interest. Modeling under a martingale measure imposes some conditions on the model and, in interest rate theory, a typical such condition is the Heath–Jarrow–Morton (HJM) drift condition.

Starting from the OIS bonds, we shall first derive a suitable numeraire and then consider as martingale measure a measure *Q* under which not only the OIS bonds, but also the FRA contracts seen as basic quantities in the bond market, are local martingales when expressed in units of the given numeraire. To this basic market one can then add various derivatives imposing that their prices, expressed in units of the numeraire, are local martingales under *Q*.

Having made the choice of the OIS curve $T \mapsto p(t,T)$ as the discount curve, consider the instantaneous forward rates $f(t,T) := -\frac{\partial}{\partial T}\log p(t,T)$ and let $r\_t = f(t,t)$ be the corresponding short rate at the generic time $t$. Define the OIS bank account as

$$B\_t = \exp\left(\int\_0^t r\_s ds\right) \tag{1}$$

and, as usual, the standard martingale measure $Q$ as the measure, equivalent to the physical measure $P$, that is associated to the bank account $B\_t$ as numeraire. Hence the arbitrage-free prices of all assets, discounted by $B\_t$, have to be local martingales with respect to $Q$. For derivative pricing, among them also FRA pricing, it is often more convenient to use, equivalently, the forward measure $Q^T$ associated to the OIS bond $p(t,T)$ as numeraire. The two measures $Q$ and $Q^T$ are related by their Radon–Nikodym density process

$$\frac{d \mathcal{Q}^T}{d \mathcal{Q}}\Big|\_{\mathcal{F}\_t} = \frac{p(t,T)}{B\_t\, p(0,T)} \qquad 0 \le t \le T. \tag{2}$$

As already mentioned, we shall follow the traditional martingale modeling, whereby the model dynamics are assigned under the martingale measure *Q*. This leads to defining the OIS bond prices according to

$$p(t, T) = E^{\mathcal{Q}} \left\{ \exp \left[ -\int\_{t}^{T} r\_{u} du \right] \Big| \mathcal{F}\_{t} \right\} \tag{3}$$

after having specified the *Q*−dynamics of *r*.

Coming now to the FRA contracts, recall that they concern a forward rate agreement, established at a time $t$ for a future interval $[T, T+\Delta]$, where at time $T+\Delta$ the interest corresponding to a floating rate is received in exchange for the interest corresponding to a fixed rate $R$. There exist various possible conventions concerning the timing of the payments. Here we choose payment in arrears, which in this case means at time $T+\Delta$. Typically, the floating rate is given by the Libor rate and, having assumed payments in arrears, we also assume that the rate is fixed at the beginning of the interval of interest, here at $T$. Recall that for expository simplicity we have reduced ourselves to a two-curve setup involving just a single Libor for a given tenor $\Delta$. The floating rate received at $T+\Delta$ is therefore the rate $L(T; T, T+\Delta)$, fixed at the inception time $T$. For a unitary notional, and using the $(T+\Delta)$-forward measure $Q^{T+\Delta}$ as the pricing measure, the arbitrage-free price at $t \le T$ of the FRA contract is then

$$P^{FR}(t; T, T + \Delta, R) = \Delta p(t, T + \Delta)E^{T + \Delta} \left\{ L(T; T, T + \Delta) - R \mid \mathcal{F}\_t \right\}, \quad (4)$$

where $E^{T+\Delta}$ denotes the expectation with respect to the measure $Q^{T+\Delta}$. From this expression it follows that the value of the fixed rate $R$ that makes the contract fair at time $t$ is given by

$$R\_t = E^{T+\Delta} \left\{ L(T; T, T+\Delta) \mid \mathcal{F}\_t \right\} =: L(t; T, T+\Delta) \tag{5}$$

and we shall call $L(t; T, T+\Delta)$ the forward Libor rate. Note that $L(\cdot\,; T, T+\Delta)$ is a $Q^{T+\Delta}$-martingale by construction.
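As a small numerical illustration of (4) and (5), the following sketch computes the clean FRA price from the OIS discount factor and the forward Libor rate; all input values are hypothetical and serve only to illustrate the formula.

```python
def fra_price(p_t_T_delta, fwd_libor, R, delta):
    """Clean FRA price as in Eq. (4):

        P^FRA(t; T, T+Delta, R) = Delta * p(t, T+Delta) * (L(t; T, T+Delta) - R),

    where, by Eq. (5), the conditional expectation of L(T; T, T+Delta) under
    Q^{T+Delta} is the forward Libor rate L(t; T, T+Delta)."""
    return delta * p_t_T_delta * (fwd_libor - R)

# A contract struck at R = L(t; T, T+Delta) is fair at time t, i.e. has value 0:
assert fra_price(0.97, 0.02, 0.02, 0.5) == 0.0
```

For a strike below the forward Libor rate the contract has positive value to the floating-rate receiver, and conversely.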

In view of developing a model for *L*(*T*; *T*, *T* + Δ), recall that, by absence of arbitrage arguments, the classical discrete compounding forward rate at time *t* for the future time interval [*T*, *T* + Δ] is given by

$$F(t; T, T + \Delta) = \frac{1}{\Delta} \left( \frac{p(t, T)}{p(t, T + \Delta)} - 1 \right),$$

where *p*(*t*, *T*) represents here the price of a risk-free zero coupon bond. This expression can be justified also by the fact that it represents the fair fixed rate in a forward rate agreement, where the floating rate received at *T* + Δ is

$$F(T; T, T + \Delta) = \frac{1}{\Delta} \left( \frac{1}{p(T, T + \Delta)} - 1 \right) \tag{6}$$

and we have

$$F(t; T, T + \Delta) = E^{T + \Delta} \left\{ F(T; T, T + \Delta) \mid \mathcal{F}\_t \right\}. \tag{7}$$

This makes the forward rate coherent with the risk-free bond prices, where the latter represent the expectation of the market concerning the future value of money.
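To make the definition concrete, here is a minimal sketch computing the discretely compounded OIS forward rate from two OIS bond prices; the discount factors used are hypothetical.

```python
def ois_forward(p_t_T, p_t_T_delta, delta):
    # Discretely compounded forward rate from OIS bond prices, cf. Eq. (6):
    #   F(t; T, T+Delta) = ( p(t,T) / p(t,T+Delta) - 1 ) / Delta
    return (p_t_T / p_t_T_delta - 1.0) / delta

# Hypothetical OIS discount factors for maturities T and T + Delta (Delta = 0.5):
F = ois_forward(0.990, 0.985, 0.5)  # about 1.015 % per annum
```

The observed Libor rate for the same interval would, post-crisis, typically lie above this value, the difference being the Libor-OIS spread discussed next.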

Before the financial crisis, $L(T; T, T+\Delta)$ was assumed to be equal to $F(T; T, T+\Delta)$, an assumption that allowed for various simplifications in the determination of derivative prices. After the crisis $L(T; T, T+\Delta)$ is no longer equal to $F(T; T, T+\Delta)$, and what one considers for $F(T; T, T+\Delta)$ is in fact the OIS discretely compounded rate, which is based on the OIS bonds, even though the OIS bonds are not necessarily equal to the risk-free bonds (see Sects. 1.3.1 and 1.3.2 of Grbac and Runggaldier [18] for more details on this issue). In particular, the Libor rate $L(T; T, T+\Delta)$ cannot be expressed by the right-hand side of (6). The fact that $L(T; T, T+\Delta) \neq F(T; T, T+\Delta)$ implies by (5) and (7) that also $L(t; T, T+\Delta) \neq F(t; T, T+\Delta)$ for all $t \le T$, and this leads to a Libor-OIS spread $L(t; T, T+\Delta) - F(t; T, T+\Delta)$.

Following some of the recent literature (see e.g. Kijima et al. [22], Crépey et al. [8], Filipović and Trolle [14]), one possibility is now to keep the classical relationship (6) also for $L(T; T, T+\Delta)$, thereby replacing however the bonds $p(t,T)$ by fictitious risky ones $\bar{p}(t,T)$ that are assumed to be affected by the same factors as the Libor rates. Such a bond can be seen as an average bond issued by a representative bank from the Libor group and it is therefore sometimes referred to in the literature as a Libor bond. This leads to

$$L(T; T, T + \Delta) = \frac{1}{\Delta} \left( \frac{1}{\bar{p}(T, T + \Delta)} - 1 \right). \tag{8}$$

Recall that, for simplicity of exposition, we consider a single Libor for a single tenor $\Delta$ and so also a single fictitious bond. In general, one has one Libor and one fictitious bond for each tenor, i.e. $L^{\Delta}(T; T, T+\Delta)$ and $\bar{p}^{\Delta}(T, T+\Delta)$. Note that we shall model the bond prices $\bar{p}(t,T)$, for all $t$ and $T$ with $t \le T$, even though only the prices $\bar{p}(T, T+\Delta)$, for all $T$, are needed in relation (8). Moreover, keeping in mind that the bonds $\bar{p}(t,T)$ are fictitious, they do not have to satisfy the boundary condition $\bar{p}(T,T) = 1$, but we still assume this condition in order to simplify the modeling.

To derive a dynamic model for $L(t; T, T+\Delta)$, we may now derive a dynamic model for $\bar{p}(t, T+\Delta)$, where we have to keep in mind that the latter is not a traded quantity. Inspired by a credit-risk analogy, but also by the common practice of deriving multi-curve quantities by adding a spread over the corresponding single-curve (risk-free) quantities, which in this case is the short rate $r\_t$, let us then define the Libor (risky) bond prices as

$$\bar{p}(t,T) = E^{\mathcal{Q}} \left\{ \exp \left[ -\int\_{t}^{T} (r\_u + s\_u) du \right] \mid \mathcal{F}\_t \right\},\tag{9}$$

with $s\_t$ representing the short rate spread. In case of default risk alone, $s\_t$ corresponds to the hazard rate/default intensity, but here it corresponds more generally to all the factors affecting the Libor rate, namely, besides credit risk, also liquidity risk, etc. Notice also that the spread is introduced here from the outset. Having for simplicity considered a single tenor $\Delta$ and thus a single $\bar{p}(t,T)$, we shall also consider only a single spread $s\_t$. In general, however, one has a spread $s\_t^{\Delta}$ for each tenor $\Delta$.

We now need a dynamical model for both $r\_t$ and $s\_t$, and we shall define this model directly under the martingale measure $Q$ (martingale modeling).

# **3 Short Rate Model**

# *3.1 The Model*

As mentioned, we shall consider a dynamical model for $r\_t$ and the single spread $s\_t$ under the martingale measure $Q$ that, in practice, has to be calibrated to the market. For this purpose we shall consider a factor model with several factors driving $r\_t$ and $s\_t$.

The two basic factor model classes for the short rate in the pre-crisis single-curve setup, namely the exponentially affine and the exponentially quadratic model classes, both allow for flexibility and analytical tractability, and this in turn allows for closed or semi-closed formulas for linear and optional interest rate derivatives. The former class is usually better known than the latter, but the latter has its own advantages. In fact, for the exponentially affine class one would consider $r\_t$ and $s\_t$ as given by a linear combination of the factors and so, in order to obtain positivity, one has to consider a square root model for the factors. On the other hand, in the Gaussian exponentially quadratic class one considers mean reverting Gaussian factor models, but at least some of the factors in the linear combination for $r\_t$ and $s\_t$ appear as a square. In this way the distribution of the factors always remains Gaussian; in a square-root model it is a noncentral $\chi^2$-distribution. Notice also that the exponentially quadratic models can be seen as dual to the square root exponentially affine models.

In the pre-crisis single-curve setting, the exponentially quadratic models have been considered, e.g. in El Karoui et al. [12], Pelsser [29], Gombani and Runggaldier [17], Leippold and Wu [24], Chen et al. [5], and Gaspar [16]. However, since the pre-crisis exponentially affine models are more common, there have also been more attempts to extend them to a post-crisis multi-curve setting (for an overview and details see e.g. Grbac and Runggaldier [18]). A first extension of exponentially quadratic models to a multi-curve setting can be found in Kijima et al. [22] and the present paper is devoted to a possibly full extension.

Let us now present the model for $r\_t$ and $s\_t$, where we consider not only the short rate $r\_t$ itself, but also its spread $s\_t$, to be given by a linear combination of the factors, where at least some of the factors appear as a square. To keep the presentation simple, we shall consider a small number of factors and, in order to model also a possible correlation between $r\_t$ and $s\_t$, the minimal number of factors is three. It also follows from some of the econometric literature that a small number of factors may suffice to adequately model most situations (see also Duffee [10] and Duffie and Gârleanu [11]).

Given three independent affine factor processes $\Psi\_t^i$, $i = 1, 2, 3$, having under $Q$ the Gaussian dynamics

$$d\Psi\_t^i = -b^i \Psi\_t^i dt + \sigma^i dw\_t^i, \quad i = 1, 2, 3,\tag{10}$$

with $b^i, \sigma^i > 0$ and $w\_t^i$, $i = 1, 2, 3$, independent $Q$-Wiener processes, we let

$$\begin{cases} r\_t = \Psi\_t^1 + (\Psi\_t^2)^2\\ s\_t = \kappa \Psi\_t^1 + (\Psi\_t^3)^2 \end{cases} \tag{11}$$

where $\Psi\_t^1$ is the common systematic factor allowing for instantaneous correlation between $r\_t$ and $s\_t$ with correlation intensity $\kappa$, and $\Psi\_t^2$ and $\Psi\_t^3$ are the idiosyncratic factors. Other factors may be added to drive $s\_t$, but the minimal model containing common and idiosyncratic components requires three factors, as explained above. The common factor is particularly important because we want to take into account the realistic feature of non-zero correlation between $r\_t$ and $s\_t$ in the model.
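To illustrate the factor model (10)–(11), the following sketch simulates one path of the three Gaussian factors by an Euler scheme and builds the short rate and the spread from them; all parameter values are illustrative, not calibrated.

```python
import math, random

def simulate_r_s(T=1.0, n=1000, b=(0.8, 0.5, 0.6), sigma=(0.01, 0.10, 0.08),
                 kappa=0.3, psi0=(0.0, 0.1, 0.05), seed=42):
    """Euler scheme for the Q-dynamics (10) and the map (11).

    Returns the path of (r_t, s_t); b, sigma, kappa and psi0 are illustrative."""
    random.seed(seed)
    dt = T / n
    psi = list(psi0)
    path = []
    for _ in range(n):
        for i in range(3):
            # dPsi^i = -b^i Psi^i dt + sigma^i dw^i
            psi[i] += -b[i] * psi[i] * dt + sigma[i] * math.sqrt(dt) * random.gauss(0.0, 1.0)
        r = psi[0] + psi[1] ** 2          # r_t = Psi^1 + (Psi^2)^2
        s = kappa * psi[0] + psi[2] ** 2  # s_t = kappa Psi^1 + (Psi^3)^2
        path.append((r, s))
    return path
```

The common factor $\Psi^1$ enters both $r$ and $s$, which is what produces the instantaneous correlation with intensity $\kappa$.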

*Remark 3.1* The zero mean-reversion level is considered here only for convenience of simpler formulas, but it can easily be taken to be positive, so that short rates and spreads become negative only with small probability (see Kijima and Muromachi [21] for an alternative representation of the spreads in terms of Gaussian factors that guarantees the spreads remain nonnegative and still allows for correlation between $r\_t$ and $s\_t$). Note, however, that given the current market situation, where the observed interest rates are very close to zero and sometimes also negative, even models with negative mean-reversion level have been considered, as well as models allowing for regime switching in the mean reversion parameter.

*Remark 3.2* For the short rate itself one could also consider the model $r\_t = \varphi\_t + \Psi\_t^1 + (\Psi\_t^2)^2$, where $\varphi\_t$ is a deterministic shift extension (see Brigo and Mercurio [2]) that allows for a good fit to the initial term structure in short rate models even with constant model parameters.

In the model (11) we have included a linear term $\Psi\_t^1$, which may lead to negative values of rates and spreads, although only with small probability in the case of models of the type (10) with a positive mean reversion level. The advantage of including this linear term is more generality and flexibility in the model. Moreover, it allows us to express $\bar{p}(t,T)$ as $p(t,T)$ multiplied by a factor. This property will lead to an adjustment factor by which one can express post-crisis quantities in terms of corresponding pre-crisis quantities; see Morino and Runggaldier [27], in which this idea was first proposed in the context of exponentially affine short rate models for multiple curves.

# *3.2 Bond Prices (OIS and Libor Bonds)*

In this subsection we derive explicit pricing formulas for the OIS bonds $p(t,T)$ as defined in (3) and the fictitious Libor bonds $\bar{p}(t,T)$ as defined in (9). Thereby, $r\_t$ and $s\_t$ are supposed to be given by (11), with the factor processes $\Psi\_t^i$ evolving under the standard martingale measure $Q$ according to (10). Defining the matrices

$$F = \begin{bmatrix} -b^1 & 0 & 0 \\ 0 & -b^2 & 0 \\ 0 & 0 & -b^3 \end{bmatrix}, D = \begin{bmatrix} \sigma^1 & 0 & 0 \\ 0 & \sigma^2 & 0 \\ 0 & 0 & \sigma^3 \end{bmatrix} \tag{12}$$

and considering the vector factor process $\Psi\_t := [\Psi\_t^1, \Psi\_t^2, \Psi\_t^3]'$ as well as the multivariate Wiener process $W\_t := [w\_t^1, w\_t^2, w\_t^3]'$, where $'$ denotes transposition, the dynamics (10) can be rewritten in synthetic form as

$$d\Psi\_t = F\Psi\_t dt + DdW\_t.\tag{13}$$

Using results on exponential quadratic term structures (see Gombani and Runggaldier [17], Filipović [13]), we have

$$\begin{split} p(t,T) &= E^{\mathcal{Q}} \left\{ e^{-\int\_{t}^{T} r\_{u} du} \Big| \mathcal{F}\_{t} \right\} = E^{\mathcal{Q}} \left\{ e^{-\int\_{t}^{T} (\Psi\_{u}^{1} + (\Psi\_{u}^{2})^{2}) du} \Big| \mathcal{F}\_{t} \right\} \\ &= \exp \left[ -A(t,T) - B^{\prime}(t,T)\Psi\_{t} - \Psi\_{t}^{\prime}C(t,T)\Psi\_{t} \right] \end{split} \tag{14}$$

and, setting *Rt* := *rt* + *st*,

$$\bar{p}(t,T) = E^{\mathcal{Q}} \left\{ e^{-\int\_{t}^{T} R\_{u} du} \Big| \mathcal{F}\_{t} \right\} = E^{\mathcal{Q}} \left\{ e^{-\int\_{t}^{T} ((1+\kappa)\Psi\_{u}^{1} + (\Psi\_{u}^{2})^{2} + (\Psi\_{u}^{3})^{2}) du} \Big| \mathcal{F}\_{t} \right\}$$

$$= \exp\Big[ -\bar{A}(t,T) - \bar{B}'(t,T)\Psi\_{t} - \Psi\_{t}'\bar{C}(t,T)\Psi\_{t} \Big], \tag{15}$$

where $A(t,T)$, $\bar{A}(t,T)$, $B(t,T)$, $\bar{B}(t,T)$, $C(t,T)$ and $\bar{C}(t,T)$ are scalar-, vector-, and matrix-valued deterministic functions to be determined.

For this purpose we recall the Heath–Jarrow–Morton (HJM) approach for the case when $p(t,T)$ in (14) represents the price of a risk-free zero coupon bond. The HJM approach leads to the so-called HJM drift conditions, which impose conditions on the coefficients in (14) so that the resulting prices $p(t,T)$ do not imply arbitrage possibilities. Since the risk-free bonds are traded, the no-arbitrage condition is expressed by requiring $\frac{p(t,T)}{B\_t}$ to be a $Q$-martingale for $B\_t$ defined as in (1), and it is exactly this martingale property that yields the drift condition. In our case, $p(t,T)$ is the price of an OIS bond that is not necessarily traded and in general does not coincide with the price of a risk-free bond. However, whether the OIS bond is traded or not, $\frac{p(t,T)}{B\_t}$ is a $Q$-martingale by the very definition of $p(t,T)$ in (14) (see the first equality in (14)), and so we can follow the same HJM approach to obtain conditions on the coefficients in (14) also in our case.

For what concerns, on the other hand, the coefficients in (15), recall that $\bar{p}(t,T)$ is a fictitious asset that is not traded and thus is not subject to any no-arbitrage condition. Notice, however, that by analogy to $p(t,T)$ in (14), by its very definition given in the first equality in (15), $\frac{\bar{p}(t,T)}{\bar{B}\_t}$ is a $Q$-martingale for $\bar{B}\_t$ given by $\bar{B}\_t := \exp\left(\int\_0^t R\_u du\right)$. The two cases $p(t,T)$ and $\bar{p}(t,T)$ can thus be treated in complete analogy provided that we use for $\bar{p}(t,T)$ the numeraire $\bar{B}\_t$.

We shall next derive from the $Q$-martingale property of $\frac{p(t,T)}{B\_t}$ and $\frac{\bar{p}(t,T)}{\bar{B}\_t}$ conditions on the coefficients in (14) and (15) that correspond to the classical HJM drift condition and thus lead to ODEs for these coefficients. For this purpose we shall proceed by analogy to Sect. 2 in [17], in particular to the proof of Proposition 2.1 therein, to which we also refer for more detail.

Introducing the "instantaneous forward rates" $f(t,T) := -\frac{\partial}{\partial T}\log p(t,T)$ and $\bar{f}(t,T) := -\frac{\partial}{\partial T}\log \bar{p}(t,T)$, and setting

$$a(t,T) := \frac{\partial}{\partial T} A(t,T), \quad b(t,T) := \frac{\partial}{\partial T} B(t,T), \quad c(t,T) := \frac{\partial}{\partial T} C(t,T) \tag{16}$$

and analogously for $\bar{a}(t,T)$, $\bar{b}(t,T)$, $\bar{c}(t,T)$, from (14) and (15) we obtain

$$f(t,T) = a(t,T) + b'(t,T)\Psi\_t + \Psi\_t'c(t,T)\Psi\_t,\tag{17}$$

$$
\bar{f}(t,T) = \bar{a}(t,T) + \bar{b}'(t,T)\Psi\_t + \Psi\_t'\bar{c}(t,T)\Psi\_t. \tag{18}
$$

Recalling that $r\_t = f(t,t)$ and $R\_t = \bar{f}(t,t)$, this implies, with $a(t) := a(t,t)$, $b(t) := b(t,t)$, $c(t) := c(t,t)$ and analogously for the corresponding quantities with a bar, that

$$r\_t = a(t) + b'(t)\Psi\_t + \Psi\_t'c(t)\Psi\_t \tag{19}$$

and

$$R\_t = r\_t + s\_t = \bar{a}(t) + \bar{b}'(t)\Psi\_t + \Psi\_t'\bar{c}(t)\Psi\_t. \tag{20}$$

Comparing (19) and (20) with (11), we obtain the following conditions where *i*, *j* = 1, 2, 3, namely

$$\begin{cases} a(t) = 0 \\ b^i(t) = \mathbf{1}\_{\{i=1\}} \\ c^{ij}(t) = \mathbf{1}\_{\{i=j=2\}} \end{cases} \quad \begin{cases} \bar{a}(t) = 0 \\ \bar{b}^i(t) = (1+\kappa)\mathbf{1}\_{\{i=1\}} \\ \bar{c}^{ij}(t) = \mathbf{1}\_{\{i=j=2\} \cup \{i=j=3\}} \end{cases}$$

Using next the fact that

$$p(t,T) = \exp\left[-\int\_{t}^{T} f(t,s)ds\right], \quad \bar{p}(t,T) = \exp\left[-\int\_{t}^{T} \bar{f}(t,s)ds\right]$$

and imposing $\frac{p(t,T)}{B\_t}$ and $\frac{\bar{p}(t,T)}{\bar{B}\_t}$ to be $Q$-martingales, one obtains ordinary differential equations to be satisfied by $c(t,T)$, $b(t,T)$, $a(t,T)$, and analogously for the quantities with a bar. Integrating these ODEs with respect to the second variable and recalling (16), one obtains (for the details see the proof of Proposition 2.1 in [17])

$$\begin{cases} C\_t(t,T) + 2FC(t,T) - 2C(t,T)DD'C(t,T) + c(t) = 0, & C(T,T) = 0\\ \bar{C}\_t(t,T) + 2F\bar{C}(t,T) - 2\bar{C}(t,T)DD'\bar{C}(t,T) + \bar{c}(t) = 0, & \bar{C}(T,T) = 0 \end{cases} \tag{21}$$

with

$$c(t) = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \quad \bar{c}(t) = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. \tag{22}$$

The special forms of $F$, $D$, $c(t)$ and $\bar{c}(t)$, together with the boundary conditions $C(T,T) = 0$ and $\bar{C}(T,T) = 0$, imply that only $C^{22}$, $\bar{C}^{22}$, $\bar{C}^{33}$ are non-zero and satisfy

$$\begin{cases} C\_t^{22}(t,T) - 2b^2 C^{22}(t,T) - 2(\sigma^2)^2 (C^{22}(t,T))^2 + 1 = 0, & C^{22}(T,T) = 0\\ \bar{C}\_t^{22}(t,T) - 2b^2 \bar{C}^{22}(t,T) - 2(\sigma^2)^2 (\bar{C}^{22}(t,T))^2 + 1 = 0, & \bar{C}^{22}(T,T) = 0\\ \bar{C}\_t^{33}(t,T) - 2b^3 \bar{C}^{33}(t,T) - 2(\sigma^3)^2 (\bar{C}^{33}(t,T))^2 + 1 = 0, & \bar{C}^{33}(T,T) = 0 \end{cases} \tag{23}$$

that can be shown to have as solution

$$\begin{cases} C^{22}(t,T) = \bar{C}^{22}(t,T) = \frac{2(e^{(T-t)h^2} - 1)}{2h^2 + (2b^2 + h^2)(e^{(T-t)h^2} - 1)} \\ \bar{C}^{33}(t,T) = \frac{2(e^{(T-t)h^3} - 1)}{2h^3 + (2b^3 + h^3)(e^{(T-t)h^3} - 1)} \end{cases} \tag{24}$$

with $h^i = \sqrt{4(b^i)^2 + 8(\sigma^i)^2} > 0$, $i = 2, 3$.
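The closed form (24) can be checked numerically against the Riccati equation (23): rewriting the first equation in $\tau = T - t$ gives $dC/d\tau = 1 - 2b^2 C - 2(\sigma^2)^2 C^2$ with $C(0) = 0$, which can be integrated by a standard Runge–Kutta scheme. A sketch with illustrative parameter values:

```python
import math

def C22_closed(tau, b, sigma):
    # Closed-form solution (24), tau = T - t, with h = sqrt(4 b^2 + 8 sigma^2)
    h = math.sqrt(4.0 * b ** 2 + 8.0 * sigma ** 2)
    e = math.exp(tau * h) - 1.0
    return 2.0 * e / (2.0 * h + (2.0 * b + h) * e)

def C22_numeric(tau, b, sigma, n=20000):
    # RK4 integration of the Riccati ODE (23) rewritten in tau = T - t:
    #   dC/dtau = 1 - 2 b C - 2 sigma^2 C^2,  C(0) = 0
    f = lambda C: 1.0 - 2.0 * b * C - 2.0 * sigma ** 2 * C ** 2
    C, dt = 0.0, tau / n
    for _ in range(n):
        k1 = f(C)
        k2 = f(C + 0.5 * dt * k1)
        k3 = f(C + 0.5 * dt * k2)
        k4 = f(C + dt * k3)
        C += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return C
```

For instance with $b^2 = 0.5$, $\sigma^2 = 0.2$ the two agree to high accuracy; the same check applies to $\bar{C}^{33}$ with the parameters $(b^3, \sigma^3)$.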

Next, again by analogy to the proof of Proposition 2.1 in [17], the vectors of coefficients $B(t,T)$ and $\bar{B}(t,T)$ of the first order terms can be seen to satisfy the following system

$$\begin{cases} B\_t(t,T) + B(t,T)F - 2B(t,T)DD'C(t,T) + b(t) = 0, & B(T,T) = 0 \\ \bar{B}\_t(t,T) + \bar{B}(t,T)F - 2\bar{B}(t,T)DD'\bar{C}(t,T) + \bar{b}(t) = 0, & \bar{B}(T,T) = 0 \end{cases} \tag{25}$$

with

$$b(t) = [1, 0, 0] \quad \bar{b}(t) = [(1 + \kappa), 0, 0].$$

Noticing, similarly as above, that only $B^1(t,T)$ and $\bar{B}^1(t,T)$ are non-zero, system (25) becomes

$$\begin{cases} B\_t^1(t,T) - b^1 B^1(t,T) + 1 = 0 & B^1(T,T) = 0\\ \bar{B}\_t^1(t,T) - b^1 \bar{B}^1(t,T) + (1+\kappa) = 0 & \bar{B}^1(T,T) = 0 \end{cases} \tag{26}$$

leading to the explicit solution

$$\begin{cases} B^1(t,T) = \frac{1}{b^1} \left( 1 - e^{-b^1(T-t)} \right) \\ \bar{B}^1(t,T) = \frac{1+\kappa}{b^1} \left( 1 - e^{-b^1(T-t)} \right) = (1+\kappa)B^1(t,T). \end{cases} \tag{27}$$
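A quick finite-difference check (with illustrative parameter values) confirms that the closed form (27) solves the linear ODE (26), rewritten in $\tau = T - t$ as $dB^1/d\tau = 1 - b^1 B^1$ with $B^1(0) = 0$:

```python
import math

def B1(tau, b1):
    # Closed form (27) as a function of tau = T - t
    return (1.0 - math.exp(-b1 * tau)) / b1

b1, tau, eps = 0.7, 2.0, 1e-6
# Central finite difference of B1 in tau:
dB = (B1(tau + eps, b1) - B1(tau - eps, b1)) / (2.0 * eps)
# ODE (26) in tau-form: dB/dtau = 1 - b1 * B
assert abs(dB - (1.0 - b1 * B1(tau, b1))) < 1e-8
```

Note that $\bar{B}^1 = (1+\kappa)B^1$ follows directly by linearity, since $\bar{b}(t)$ merely scales the constant term by $(1+\kappa)$.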

Finally, *A*(*t*, *T*) and *A*¯(*t*, *T*) have to satisfy

$$\begin{cases} A\_t(t,T) + (\sigma^2)^2 C^{22}(t,T) - \frac{1}{2} (\sigma^1)^2 (B^1(t,T))^2 = 0, \\ \bar{A}\_t(t,T) + (\sigma^2)^2 \bar{C}^{22}(t,T) + (\sigma^3)^2 \bar{C}^{33}(t,T) - \frac{1}{2} (\sigma^1)^2 (\bar{B}^1(t,T))^2 = 0 \end{cases} \tag{28}$$

with boundary conditions *A*(*T*, *T*) = 0, *A*¯(*T*, *T*) = 0. The explicit expressions can be obtained simply by integrating the above equations.

Summarizing, we have proved the following:

**Proposition 3.1** *Assume that the OIS short rate r and the spread s are given by* (*11*) *with the factor processes* $\Psi\_t^i$, $i = 1, 2, 3$, *evolving according to* (*10*) *under the standard martingale measure Q. The time-t price of the OIS bond* $p(t,T)$*, as defined in* (*3*)*, is given by*

$$p(t,T) = \exp[-A(t,T) - B^1(t,T)\Psi\_t^1 - C^{22}(t,T)(\Psi\_t^2)^2],\tag{29}$$

*and the time-t price of the fictitious Libor bond* $\bar{p}(t,T)$*, as defined in* (*9*)*, by*

$$\begin{split} \bar{p}(t,T) &= \exp[-\bar{A}(t,T) - (1+\kappa)B^1(t,T)\Psi\_t^1 - C^{22}(t,T)(\Psi\_t^2)^2 - \bar{C}^{33}(t,T)(\Psi\_t^3)^2] \\ &= p(t,T)\exp[-\tilde{A}(t,T) - \kappa B^1(t,T)\Psi\_t^1 - \bar{C}^{33}(t,T)(\Psi\_t^3)^2], \end{split} \tag{30}$$

*where* $\tilde{A}(t,T) := \bar{A}(t,T) - A(t,T)$ *with* $A(t,T)$ *and* $\bar{A}(t,T)$ *given by* (*28*)*,* $B^1(t,T)$ *given by* (*27*)*, and* $C^{22}(t,T)$ *and* $\bar{C}^{33}(t,T)$ *given by* (*24*)*.*

In particular, expression (30) gives $\bar{p}(t,T)$ in terms of $p(t,T)$. Based on this we shall derive in the following section the announced adjustment factor allowing one to pass from pre-crisis quantities to the corresponding post-crisis quantities.
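As a sketch of how the multiplicative factor in (30) can be evaluated, the code below combines the closed forms (24) and (27) with a trapezoidal integration of $\tilde{A}\_t(t,T) = \bar{A}\_t(t,T) - A\_t(t,T)$ obtained by subtracting the two equations in (28) (the $C^{22}$ terms cancel since $\bar{C}^{22} = C^{22}$); all parameter and factor values are illustrative, not calibrated.

```python
import math

# Illustrative (non-calibrated) parameters
b1, b3 = 0.8, 0.6
sig1, sig3 = 0.01, 0.05
kappa = 0.3

def B1(tau):
    # Eq. (27)
    return (1.0 - math.exp(-b1 * tau)) / b1

def C33(tau):
    # Eq. (24) with h^3 = sqrt(4 (b^3)^2 + 8 (sigma^3)^2)
    h = math.sqrt(4.0 * b3 ** 2 + 8.0 * sig3 ** 2)
    e = math.exp(tau * h) - 1.0
    return 2.0 * e / (2.0 * h + (2.0 * b3 + h) * e)

def A_tilde(t, T, n=2000):
    # From (28): tilde A_t = 0.5 sig1^2 ((1+kappa)^2 - 1) (B^1)^2 - sig3^2 C^33,
    # so with tilde A(T,T) = 0,
    #   tilde A(t,T) = int_t^T [ sig3^2 C33(u,T)
    #                            - 0.5 sig1^2 ((1+kappa)^2 - 1) B1(u,T)^2 ] du
    # (trapezoidal rule below).
    du = (T - t) / n
    total = 0.0
    for i in range(n + 1):
        u = t + i * du
        w = 0.5 if i in (0, n) else 1.0
        total += w * (sig3 ** 2 * C33(T - u)
                      - 0.5 * sig1 ** 2 * ((1.0 + kappa) ** 2 - 1.0) * B1(T - u) ** 2)
    return total * du

def adjustment(t, T, psi1, psi3):
    # Multiplicative factor in (30): bar p(t,T) = p(t,T) * adjustment(...)
    return math.exp(-A_tilde(t, T) - kappa * B1(T - t) * psi1
                    - C33(T - t) * psi3 ** 2)
```

With these parameters and positive factor values the factor is below one, i.e. the fictitious Libor bond is cheaper than the OIS bond, consistent with a positive spread.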

# *3.3 Forward Measure*

The underlying factor model was defined in (10) under the standard martingale measure $Q$. For the derivative prices, which we shall determine in the following two sections, it will be convenient to work under forward measures, for which, given the single tenor $\Delta$, we shall consider a generic $(T+\Delta)$-forward measure. The density process to change the measure from $Q$ to $Q^{T+\Delta}$ is


$$\mathcal{L}\_t^{T+\Delta} := \frac{d\, \mathcal{Q}^{T+\Delta}}{d\, \mathcal{Q}}\Big|\_{\mathcal{F}\_t} = \frac{p(t, T+\Delta)}{p(0, T+\Delta)} \frac{1}{B\_t} \tag{31}$$

from which it follows, by (29) and the martingale property of $\left(\frac{p(t,T+\Delta)}{B\_t}\right)\_{t \le T+\Delta}$, that

$$d\mathcal{L}\_t^{T+\Delta} = \mathcal{L}\_t^{T+\Delta} \left( -B^1(t, T+\Delta)\sigma^1 dw\_t^1 - 2C^{22}(t, T+\Delta)\Psi\_t^2 \sigma^2 dw\_t^2 \right).$$

This implies by Girsanov's theorem that

$$\begin{cases} dw\_t^{1,T+\Delta} = dw\_t^1 + \sigma^1 \mathcal{B}^1(t, T+\Delta) dt\\ dw\_t^{2,T+\Delta} = dw\_t^2 + 2\mathcal{C}^{22}(t, T+\Delta) \Psi\_t^2 \sigma^2 dt\\ dw\_t^{3,T+\Delta} = dw\_t^3 \end{cases} \tag{32}$$

are $Q^{T+\Delta}$-Wiener processes. From the $Q$-dynamics (10) we then obtain the following $Q^{T+\Delta}$-dynamics for the factors

$$\begin{cases} d\boldsymbol{\Psi}\_{t}^{1} = -\left[b^{1}\boldsymbol{\Psi}\_{t}^{1} + (\sigma^{1})^{2}\boldsymbol{\mathcal{B}}^{1}(t, T + \Delta)\right]dt + \sigma^{1}d\boldsymbol{w}\_{t}^{1, T + \Delta} \\ d\boldsymbol{\Psi}\_{t}^{2} = -\left[b^{2}\boldsymbol{\Psi}\_{t}^{2} + 2(\sigma^{2})^{2}\boldsymbol{C}^{22}(t, T + \Delta)\boldsymbol{\Psi}\_{t}^{2}\right]dt + \sigma^{2}d\boldsymbol{w}\_{t}^{2, T + \Delta} \\ d\boldsymbol{\Psi}\_{t}^{3} = -b^{3}\boldsymbol{\Psi}\_{t}^{3}dt + \sigma^{3}d\boldsymbol{w}\_{t}^{3, T + \Delta}. \end{cases} \tag{33}$$

*Remark 3.3* While in the dynamics (10) for $\Psi_t^i$ $(i = 1, 2, 3)$ under $Q$ we had for simplicity assumed a zero mean-reversion level, under the $(T+\Delta)$-forward measure the mean-reversion level of $\Psi_t^1$ is now different from zero due to the measure transformation.

**Lemma 3.1** *Analogously to the case when $p(t,T)$ represents the price of a risk-free zero coupon bond, also for $p(t,T)$ viewed as an OIS bond we have that $\frac{p(t,T)}{p(t,T+\Delta)}$ is a $Q^{T+\Delta}$-martingale.*

*Proof* We have seen that also for OIS bonds as defined in (3) we have that, with $B_t$ as in (1), the ratio $\frac{p(t,T)}{B_t}$ is a $Q$-martingale. From Bayes' formula we then have

$$\begin{split} E^{T+\Delta}\left\{\frac{p(T,T)}{p(T,T+\Delta)}\,\Big|\,\mathcal{F}_{t}\right\} &= \frac{E^{Q}\left\{\frac{1}{p(0,T+\Delta)}\frac{1}{B_{T+\Delta}}\frac{p(T,T)}{p(T,T+\Delta)}\,\big|\,\mathcal{F}_{t}\right\}}{E^{Q}\left\{\frac{1}{p(0,T+\Delta)}\frac{1}{B_{T+\Delta}}\,\big|\,\mathcal{F}_{t}\right\}} \\ &= \frac{E^{Q}\left\{\frac{p(T,T)}{p(T,T+\Delta)}\,E^{Q}\left\{\frac{1}{B_{T+\Delta}}\,\big|\,\mathcal{F}_{T}\right\}\,\big|\,\mathcal{F}_{t}\right\}}{\frac{p(t,T+\Delta)}{B_{t}}} = \frac{B_{t}\,E^{Q}\left\{\frac{p(T,T)}{p(T,T+\Delta)}\frac{p(T,T+\Delta)}{B_{T}}\,\big|\,\mathcal{F}_{t}\right\}}{p(t,T+\Delta)} \\ &= \frac{B_{t}\,E^{Q}\left\{\frac{p(T,T)}{B_{T}}\,\big|\,\mathcal{F}_{t}\right\}}{p(t,T+\Delta)} = \frac{p(t,T)}{p(t,T+\Delta)}, \end{split}$$

thus proving the statement of the lemma. $\square$

We recall that we denote the expectation with respect to the measure $Q^{T+\Delta}$ by $E^{T+\Delta}\{\cdot\}$. The dynamics in (33) lead to Gaussian distributions for $\Psi_t^i$, $i = 1, 2, 3$, that, given $\mathcal{B}^1(\cdot)$ and $\mathcal{C}^{22}(\cdot)$, have mean and variance

$$E^{T+\Delta} \{ \Psi^i\_t \} = \bar{\alpha}^i\_t = \bar{\alpha}^i\_t(b^i, \sigma^i) \quad , \quad Var^{T+\Delta} \{ \Psi^i\_t \} = \bar{\beta}^i\_t = \bar{\beta}^i\_t(b^i, \sigma^i),$$

which can be explicitly computed. More precisely, we have

$$\begin{cases} \bar{\alpha}_{t}^{1} = e^{-b^{1}t}\Big[\Psi_{0}^{1} - \frac{(\sigma^{1})^{2}}{2(b^{1})^{2}}\,e^{-b^{1}(T+\Delta)}\big(1 - e^{2b^{1}t}\big) + \frac{(\sigma^{1})^{2}}{(b^{1})^{2}}\big(1 - e^{b^{1}t}\big)\Big] \\ \bar{\beta}_{t}^{1} = e^{-2b^{1}t}\big(e^{2b^{1}t} - 1\big)\frac{(\sigma^{1})^{2}}{2b^{1}} \\ \bar{\alpha}_{t}^{2} = e^{-(b^{2}t + 2(\sigma^{2})^{2}\tilde{C}^{22}(t, T+\Delta))}\,\Psi_{0}^{2} \\ \bar{\beta}_{t}^{2} = e^{-(2b^{2}t + 4(\sigma^{2})^{2}\tilde{C}^{22}(t, T+\Delta))}\int_{0}^{t} e^{2b^{2}s + 4(\sigma^{2})^{2}\tilde{C}^{22}(s, T+\Delta)}(\sigma^{2})^{2}\,ds \\ \bar{\alpha}_{t}^{3} = e^{-b^{3}t}\,\Psi_{0}^{3} \\ \bar{\beta}_{t}^{3} = e^{-2b^{3}t}\frac{(\sigma^{3})^{2}}{2b^{3}}\big(e^{2b^{3}t} - 1\big), \end{cases} \tag{34}$$

with

$$\begin{split} \tilde{C}^{22}(t, T+\Delta) &= \frac{2\Big(2\log\big(2b^2(e^{(T+\Delta-t)h^2} - 1) + h^2(e^{(T+\Delta-t)h^2} + 1)\big) + t\,(2b^2 + h^2)\Big)}{(2b^2 + h^2)(2b^2 - h^2)} \\ &\quad - \frac{2\cdot 2\log\big(2b^2(e^{(T+\Delta)h^2} - 1) + h^2(e^{(T+\Delta)h^2} + 1)\big)}{(2b^2 + h^2)(2b^2 - h^2)} \end{split} \tag{35}$$

and $h^2 = \sqrt{(2b^2)^2 + 8(\sigma^2)^2}$, and where we have assumed deterministic initial values $\Psi_0^1$, $\Psi_0^2$ and $\Psi_0^3$. For details of the above computation see the proof of Corollary 4.1.3 in Meneghello [25].
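The Gaussian moments in (34) can be checked numerically. The sketch below (all parameter values are illustrative) integrates the mean ODE $m'(t) = -b^1 m(t) - (\sigma^1)^2\mathcal{B}^1(t, T+\Delta)$, implied by the first equation in (33), with a simple Euler scheme and compares the result with the closed-form expression for $\bar{\alpha}_t^1$:

```python
import math

def B1(t, U, b1):
    # B^1(t, U) = (1/b^1) * (1 - e^{-b^1 (U - t)}), cf. (27)
    return (1.0 - math.exp(-b1 * (U - t))) / b1

def alpha1_closed(t, U, psi0, b1, sig1):
    # closed-form mean of Psi^1_t under the (T+Delta)-forward measure, cf. (34), U = T+Delta
    e = math.exp(-b1 * t)
    return e * (psi0
                - (sig1 ** 2) / (2 * b1 ** 2) * math.exp(-b1 * U) * (1 - math.exp(2 * b1 * t))
                + (sig1 ** 2) / (b1 ** 2) * (1 - math.exp(b1 * t)))

def alpha1_euler(t, U, psi0, b1, sig1, n=100_000):
    # Euler integration of the mean ODE m' = -b^1 m - (sigma^1)^2 B^1(s, U)
    dt = t / n
    m, s = psi0, 0.0
    for _ in range(n):
        m += (-b1 * m - sig1 ** 2 * B1(s, U, b1)) * dt
        s += dt
    return m
```

The two values should agree up to the Euler discretization error.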

# **4 Pricing of Linear Interest Rate Derivatives**

We have discussed in Sect. 3.2 the pricing of OIS and Libor bonds in the Gaussian, exponentially quadratic short rate model introduced in Sect. 3.1. In the remaining part of the paper we shall be concerned with the pricing of interest rate derivatives, namely derivatives having the Libor rate as underlying rate. In the present section we deal with the basic linear derivatives, namely FRAs and interest rate swaps, while nonlinear derivatives will be dealt with in the subsequent Sect. 5. For the FRA rates discussed in Sect. 4.1 we shall in Sect. 4.1.1 exhibit an adjustment factor allowing one to pass from the single-curve FRA rate to the multi-curve FRA rate.

# *4.1 FRAs*

We start by recalling the definition of a standard forward rate agreement. We emphasize that we use a text-book definition which differs slightly from a market definition, see Mercurio [26].

**Definition 4.1** Given the time points $0 \le t \le T < T+\Delta$, a forward rate agreement (FRA) is an OTC derivative that allows the holder to lock in at the generic date $t \le T$ the interest rate between the inception date $T$ and the maturity $T+\Delta$ at a fixed value $R$. At maturity $T+\Delta$ a payment based on the fixed rate $R$, applied to a notional amount $N$, is made, and the one based on the relevant floating rate (generally the spot Libor rate $L(T; T, T+\Delta)$) is received.

Recalling that for the Libor rate we had postulated the relation (8) to hold at the inception time *T*, namely

$$L(T; T, T + \Delta) = \frac{1}{\Delta} \left( \frac{1}{\bar{p}(T, T + \Delta)} - 1 \right),$$

the price, at *t* ≤ *T*, of the FRA with fixed rate *R* and notional *N* can be computed under the (*T* + Δ)-forward measure as

$$\begin{aligned} &P^{FRA}(t;T,T+\Delta,R,N) \\ &= N\Delta\, p(t,T+\Delta)E^{T+\Delta}\left\{L(T;T,T+\Delta)-R \mid \mathcal{F}_t\right\} \\ &= Np(t,T+\Delta)E^{T+\Delta}\left\{\frac{1}{\bar{p}(T,T+\Delta)}-(1+\Delta R)\mid \mathcal{F}_t\right\}.\end{aligned} \tag{36}$$

Defining

$$\bar{\nu}\_{t,T} := E^{T+\Delta} \left\{ \frac{1}{\bar{p}(T, T+\Delta)} \mid \mathcal{F}\_t \right\}, \tag{37}$$

it is easily seen from (36) that the fair rate of the FRA, namely the FRA rate, is given by

$$
\bar{R}\_t = \frac{1}{\Delta} \left( \bar{\nu}\_{t,T} - 1 \right). \tag{38}
$$

In the single-curve case we have instead

$$R\_t = \frac{1}{\Delta} \left(\nu\_{t,T} - 1\right),\tag{39}$$

where, given that $\frac{p(\cdot,T)}{p(\cdot,T+\Delta)}$ is a $Q^{T+\Delta}$-martingale (see Lemma 3.1),

$$\nu\_{t,T} := E^{T+\Delta} \left\{ \frac{1}{p(T, T+\Delta)} \mid \mathcal{F}\_t \right\} = \frac{p(t, T)}{p(t, T+\Delta)},\tag{40}$$

which is the classical expression for the FRA rate in the single-curve case. Notice that, contrary to (37), the expression in (40) can be explicitly computed on the basis of bond price data without requiring an interest rate model.
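Expression (40) makes the single-curve FRA rate directly computable from observed bond prices. A minimal sketch, in which the flat OIS curve is purely illustrative:

```python
import math

def fra_rate_single_curve(p_T, p_TD, delta):
    # nu_{t,T} = p(t,T)/p(t,T+Delta), cf. (40); R_t = (nu_{t,T} - 1)/Delta, cf. (39)
    nu = p_T / p_TD
    return (nu - 1.0) / delta

# illustrative flat OIS curve p(t,S) = exp(-r (S - t))
r, t, T, delta = 0.02, 0.0, 1.0, 0.5
p_T = math.exp(-r * (T - t))
p_TD = math.exp(-r * (T + delta - t))
R = fra_rate_single_curve(p_T, p_TD, delta)
```

For a flat curve the FRA rate reduces to $(e^{r\Delta}-1)/\Delta$, i.e. the simply compounded rate over the accrual period.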

#### **4.1.1 Adjustment Factor**

We shall show here the following:

**Proposition 4.1** *We have the relationship*

$$
\bar{\nu}\_{t,T} = \nu\_{t,T} \cdot \mathrm{Ad}\_t^{T,\Delta} \cdot \mathrm{Res}\_t^{T,\Delta} \tag{41}
$$

*with*

$$Ad_{t}^{T,\Delta} := E^{Q}\left\{\frac{p(T, T+\Delta)}{\bar{p}(T, T+\Delta)}\,\Big|\,\mathcal{F}_{t}\right\} = E^{Q}\Big\{\exp\big[\tilde{A}(T, T+\Delta) + \kappa\,\mathcal{B}^{1}(T, T+\Delta)\Psi_{T}^{1} + \bar{\mathcal{C}}^{33}(T, T+\Delta)(\Psi_{T}^{3})^{2}\big]\,\Big|\,\mathcal{F}_{t}\Big\} \tag{42}$$

*and*

$$Res\_t^{T, \Delta} = \exp\left[ -\kappa \frac{(\sigma^1)^2}{2(b^1)^3} \left( 1 - e^{-b^1 \Delta} \right) \left( 1 - e^{-b^1(T-t)} \right)^2 \right],\tag{43}$$

*where $\tilde{A}(t,T)$ is defined after (30), $\mathcal{B}^{1}(t,T)$ in (27) and $\bar{\mathcal{C}}^{33}(t,T)$ in (24).*
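The residual factor (43) is a deterministic quantity and can be evaluated directly; in particular it reduces to $1$ when $\kappa = 0$. A small sketch with illustrative parameter values:

```python
import math

def res_factor(t, T, delta, kappa, b1, sig1):
    # Res_t^{T,Delta} from (43)
    return math.exp(
        -kappa * sig1 ** 2 / (2 * b1 ** 3)
        * (1 - math.exp(-b1 * delta))
        * (1 - math.exp(-b1 * (T - t))) ** 2
    )
```

For $\kappa > 0$ and $b^1 > 0$ the exponent is negative, so the residual factor lies strictly below $1$.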

*Proof* Firstly, from (30) we obtain

$$\frac{p(T, T+\Delta)}{\bar{p}(T, T+\Delta)} = e^{\tilde{A}(T, T+\Delta) + \kappa\mathcal{B}^1(T, T+\Delta)\Psi_T^1 + \bar{\mathcal{C}}^{33}(T, T+\Delta)(\Psi_T^3)^2}. \tag{44}$$

In (37) we now change back from the $(T+\Delta)$-forward measure to the standard martingale measure using the density process $\mathcal{L}_t$ given in (31). Using furthermore the above expression for the ratio of the OIS and the Libor bond prices, and taking into account the definition of the short rate $r_t$ in terms of the factor processes, we obtain

$$\begin{split} \bar{\nu}_{t,T} &= E^{T+\Delta}\left\{\frac{1}{\bar{p}(T,T+\Delta)}\,\Big|\,\mathcal{F}_{t}\right\} = \mathcal{L}_{t}^{-1}E^{Q}\left\{\frac{\mathcal{L}_{T}}{\bar{p}(T,T+\Delta)}\,\Big|\,\mathcal{F}_{t}\right\} \\ &= \frac{1}{p(t,T+\Delta)}E^{Q}\left\{\exp\Big(-\int_{t}^{T}r_{u}\,du\Big)\frac{p(T,T+\Delta)}{\bar{p}(T,T+\Delta)}\,\Big|\,\mathcal{F}_{t}\right\} \\ &= \frac{1}{p(t,T+\Delta)}\exp[\tilde{A}(T,T+\Delta)]\,E^{Q}\left\{e^{\bar{\mathcal{C}}^{33}(T,T+\Delta)(\Psi_{T}^{3})^{2}}\,\big|\,\mathcal{F}_{t}\right\} \cdot E^{Q}\left\{e^{-\int_{t}^{T}(\Psi_{u}^{1}+(\Psi_{u}^{2})^{2})du}\,e^{\kappa\mathcal{B}^{1}(T,T+\Delta)\Psi_{T}^{1}}\,\Big|\,\mathcal{F}_{t}\right\} \\ &= \frac{1}{p(t,T+\Delta)}\exp[\tilde{A}(T,T+\Delta)]\,E^{Q}\left\{e^{\bar{\mathcal{C}}^{33}(T,T+\Delta)(\Psi_{T}^{3})^{2}}\,\big|\,\mathcal{F}_{t}\right\} \\ &\qquad\cdot E^{Q}\left\{e^{-\int_{t}^{T}\Psi_{u}^{1}du}\,e^{\kappa\mathcal{B}^{1}(T,T+\Delta)\Psi_{T}^{1}}\,\Big|\,\mathcal{F}_{t}\right\}\,E^{Q}\left\{e^{-\int_{t}^{T}(\Psi_{u}^{2})^{2}du}\,\Big|\,\mathcal{F}_{t}\right\}, \end{split} \tag{45}$$

where we have used the independence of the factors $\Psi^i$, $i = 1, 2, 3$, under $Q$.

Recall now from the theory of affine processes (see e.g. Lemma 2.1 in Grbac and Runggaldier [18]) that, for a process $\Psi_t^1$ satisfying (10), we have for all $\delta, K \in \mathbb{R}$

$$E^{Q}\left\{\exp\left[-\int_{t}^{T}\delta\,\Psi_{u}^{1}\,du - K\Psi_{T}^{1}\right]\,\Big|\,\mathcal{F}_{t}\right\} = \exp[\alpha^{1}(t,T) - \beta^{1}(t,T)\Psi_{t}^{1}], \tag{46}$$

where

$$\begin{cases} \beta^{1}(t,T) = Ke^{-b^{1}(T-t)} - \frac{\delta}{b^{1}}\left(e^{-b^{1}(T-t)} - 1\right), \\ \alpha^{1}(t,T) = \frac{(\sigma^{1})^{2}}{2}\int_{t}^{T}(\beta^{1}(u,T))^{2}\,du. \end{cases}$$

Setting $K = -\kappa\,\mathcal{B}^{1}(T, T+\Delta)$ and $\delta = 1$, and recalling from (27) that $\mathcal{B}^{1}(t,T) = \frac{1}{b^{1}}\big(1 - e^{-b^{1}(T-t)}\big)$, this leads to

$$\begin{split} &E^{Q}\left\{e^{-\int_{t}^{T}\Psi_{u}^{1}du}\,e^{\kappa\mathcal{B}^{1}(T,T+\Delta)\Psi_{T}^{1}}\,\Big|\,\mathcal{F}_{t}\right\} \\ &= \exp\Big[\frac{(\sigma^{1})^{2}}{2}\big(\kappa\mathcal{B}^{1}(T,T+\Delta)\big)^{2}\int_{t}^{T}e^{-2b^{1}(T-u)}du \\ &\quad - \kappa\mathcal{B}^{1}(T,T+\Delta)(\sigma^{1})^{2}\int_{t}^{T}\mathcal{B}^{1}(u,T)e^{-b^{1}(T-u)}du + \frac{(\sigma^{1})^{2}}{2}\int_{t}^{T}\big(\mathcal{B}^{1}(u,T)\big)^{2}du \\ &\quad + \Big(\kappa\mathcal{B}^{1}(T,T+\Delta)e^{-b^{1}(T-t)} - \mathcal{B}^{1}(t,T)\Big)\Psi_{t}^{1}\Big]. \end{split} \tag{47}$$

On the other hand, from the results of Sect. 3.2 we also have that, for a process $\Psi_t^2$ satisfying (10),

$$E^{Q}\left\{\exp\left[-\int_{t}^{T}(\Psi_{u}^{2})^{2}\,du\right]\,\Big|\,\mathcal{F}_{t}\right\} = \exp\left[-\alpha^{2}(t,T) - \mathcal{C}^{22}(t,T)(\Psi_{t}^{2})^{2}\right],$$

where *C*<sup>22</sup>(*t*, *T*) corresponds to (24) and (see (28))

$$\alpha^2(t, T) = (\sigma^2)^2 \int\_t^T C^{22}(u, T) du.$$

This implies that

$$\begin{split} &E^{\mathcal{Q}} \left\{ \exp \left[ -\int\_{t}^{T} (\Psi\_{u}^{2})^{2} du \right] \mid \mathcal{F}\_{t} \right\} \\ &= \exp \left[ - (\sigma^{2})^{2} \int\_{t}^{T} \mathcal{C}^{22}(u, T) du - \mathcal{C}^{22}(t, T) \left(\Psi\_{t}^{2}\right)^{2} \right]. \end{split} \tag{48}$$

Substituting (47) and (48) into (45), and recalling the expression for $p(t,T)$ in (29) with $A(\cdot)$, $\mathcal{B}^{1}(\cdot)$, $\mathcal{C}^{22}(\cdot)$ according to (28), (27) and (24) respectively, we obtain

$$\begin{split} \bar{\nu}_{t,T} &= \frac{p(t,T)}{p(t,T+\Delta)}\,e^{\tilde{A}(T,T+\Delta)}\,E^{Q}\left\{e^{\bar{\mathcal{C}}^{33}(T,T+\Delta)(\Psi_{T}^{3})^{2}}\,\big|\,\mathcal{F}_{t}\right\} \\ &\quad\cdot\exp\Big[\frac{(\sigma^{1})^{2}}{2}\big(\kappa\mathcal{B}^{1}(T,T+\Delta)\big)^{2}\int_{t}^{T}e^{-2b^{1}(T-u)}du + \kappa\mathcal{B}^{1}(T,T+\Delta)e^{-b^{1}(T-t)}\Psi_{t}^{1}\Big] \\ &\quad\cdot\exp\Big[-\kappa\mathcal{B}^{1}(T,T+\Delta)(\sigma^{1})^{2}\int_{t}^{T}\mathcal{B}^{1}(u,T)e^{-b^{1}(T-u)}du\Big]. \end{split} \tag{49}$$

We recall the expression (44) for $\frac{p(T,T+\Delta)}{\bar{p}(T,T+\Delta)}$ and the fact that, according to (46), we have

$$\begin{split} &E^{Q}\left\{e^{\kappa\mathcal{B}^{1}(T,T+\Delta)\Psi_{T}^{1}}\,\Big|\,\mathcal{F}_{t}\right\} \\ &= \exp\Big[\frac{(\sigma^{1})^{2}}{2}\big(\kappa\mathcal{B}^{1}(T,T+\Delta)\big)^{2}\int_{t}^{T}e^{-2b^{1}(T-u)}du + \kappa\mathcal{B}^{1}(T,T+\Delta)e^{-b^{1}(T-t)}\Psi_{t}^{1}\Big]. \end{split}$$

Inserting these expressions into (49) we obtain the result, namely

$$\begin{split} \bar{\nu}_{t,T} &= \frac{p(t,T)}{p(t,T+\Delta)}\,E^{Q}\left\{\frac{p(T,T+\Delta)}{\bar{p}(T,T+\Delta)}\,\big|\,\mathcal{F}_{t}\right\} \cdot\exp\Big[-\kappa\mathcal{B}^{1}(T,T+\Delta)(\sigma^{1})^{2}\int_{t}^{T}\mathcal{B}^{1}(u,T)e^{-b^{1}(T-u)}du\Big] \\ &= \frac{p(t,T)}{p(t,T+\Delta)}\,E^{Q}\left\{\frac{p(T,T+\Delta)}{\bar{p}(T,T+\Delta)}\,\big|\,\mathcal{F}_{t}\right\} \\ &\quad\cdot\exp\Big[-\frac{\kappa}{b^{1}}\big(e^{-b^{1}\Delta}-1\big)(\sigma^{1})^{2}\Big(\frac{1}{2(b^{1})^{2}}\big(1-e^{-2b^{1}(T-t)}\big)-\frac{1}{(b^{1})^{2}}\big(1-e^{-b^{1}(T-t)}\big)\Big)\Big], \end{split} \tag{50}$$

where we have also used the fact that

$$\begin{aligned} \int_t^T\mathcal{B}^1(u,T)e^{-b^1(T-u)}\,du &= \int_t^T\frac{1}{b^1}\left(1 - e^{-b^1(T-u)}\right)e^{-b^1(T-u)}\,du \\ &= -\frac{1}{2(b^1)^2}\left(1 - e^{-2b^1(T-t)}\right) + \frac{1}{(b^1)^2}\left(1 - e^{-b^1(T-t)}\right). \end{aligned}$$
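The integral identity above is elementary but easy to get wrong by a sign; it can be confirmed by numerical quadrature. A sketch using Simpson's rule with illustrative parameters:

```python
import math

def B1(u, T, b1):
    # B^1(u, T), cf. (27)
    return (1.0 - math.exp(-b1 * (T - u))) / b1

def lhs_numeric(t, T, b1, n=2000):
    # Simpson's rule for int_t^T B^1(u,T) e^{-b^1 (T-u)} du; n must be even
    h = (T - t) / n
    f = lambda u: B1(u, T, b1) * math.exp(-b1 * (T - u))
    s = f(t) + f(T)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(t + k * h)
    return s * h / 3.0

def rhs_closed(t, T, b1):
    # right-hand side of the identity
    a = T - t
    return (-(1 - math.exp(-2 * b1 * a)) / (2 * b1 ** 2)
            + (1 - math.exp(-b1 * a)) / b1 ** 2)
```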

*Remark 4.1* The adjustment factor $Ad_t^{T,\Delta}$ allows for some intuitive interpretations. Here we mention only the easiest one, namely for the case $\kappa = 0$ (independence of $r_t$ and $s_t$). In this case we have $r_t + s_t > r_t$, implying that $\bar{p}(T, T+\Delta) < p(T, T+\Delta)$, so that $Ad_t^{T,\Delta} \ge 1$. Furthermore, again for $\kappa = 0$, the residual factor has value $Res_t^{T,\Delta} = 1$. All this in turn implies $\bar{\nu}_{t,T} \ge \nu_{t,T}$ and with it $\bar{R}_t \ge R_t$, which is what one would expect to be the case.

*Remark 4.2* (Calibration to the initial term structure). The parameters in the model (10) for the factors $\Psi_t^i$, and thus also in the model (11) for the short rate $r_t$ and the spread $s_t$, are the coefficients $b^i$ and $\sigma^i$ for $i = 1, 2, 3$. From (14) notice that, for $i = 1, 2$, these coefficients enter the expressions for the OIS bond prices $p(t,T)$, which can be assumed to be observable since they can be bootstrapped from the market quotes for the OIS swap rates. We may thus assume that the coefficients $b^i$ and $\sigma^i$ for $i = 1, 2$ can be calibrated as in the pre-crisis single-curve short rate models. It remains to calibrate $b^3$, $\sigma^3$ and, possibly, the correlation coefficient $\kappa$. Via (15) they affect the prices of the fictitious Libor bonds $\bar{p}(t,T)$, which are, however, not observable. One may observe though the FRA rates $R_t$ and $\bar{R}_t$, and thus also $\nu_{t,T}$ as well as $\bar{\nu}_{t,T}$. Via (41) this would then allow one to calibrate also the remaining parameters. This task would turn out to be even simpler if one had access to the value of $\kappa$ by other means.

We emphasize that, in order to ensure a good fit to the initial bond term structure, a deterministic shift extension of the model or time-dependent coefficients $b^i$ could be considered. We recall also that we have assumed the mean-reversion level equal to zero for simplicity; in practice it would be one more coefficient to be calibrated for each factor $\Psi_t^i$.

# *4.2 Interest Rate Swaps*

We first recall the notion of a (payer) interest rate swap. Given a collection of dates $0 \le T_0 < T_1 < \cdots < T_n$ with $\gamma \equiv \gamma_k := T_k - T_{k-1}$ $(k = 1, \dots, n)$, as well as a notional amount $N$, a payer swap is a financial contract where a stream of interest payments on the notional $N$ is made at a fixed rate $R$ in exchange for receiving an analogous stream corresponding to the Libor rate. Among the various possible conventions concerning the fixing for the Libor and the payment dates, we choose here the one where, for each interval $[T_{k-1}, T_k]$, the Libor rates are fixed in advance and the payments are made in arrears. The swap is thus initiated at $T_0$ and the first payment is made at $T_1$. A receiver swap is completely symmetric, with the interest at the fixed rate being received; here we concentrate on payer swaps.

The arbitrage-free price of the swap, evaluated at $t \le T_0$, is given by the following expression where, analogously to $E^{T+\Delta}\{\cdot\}$, we denote by $E^{T_k}\{\cdot\}$ the expectation with respect to the forward measure $Q^{T_k}$ $(k = 1, \dots, n)$:

$$\begin{split} P^{Sw}(t; T_0, T_n, R) &= \gamma\sum_{k=1}^n p(t, T_k)\,E^{T_k}\left\{L(T_{k-1}; T_{k-1}, T_k) - R \,|\, \mathcal{F}_t\right\} \\ &= \gamma\sum_{k=1}^n p(t, T_k)\left(L(t; T_{k-1}, T_k) - R\right). \end{split} \tag{51}$$

For easier notation we have assumed the notional to be 1, i.e. *N* = 1.
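The second equality in (51) expresses the swap as a linear combination of discount factors and forward Libor rates, so the fixed rate making the contract fair at $t$ follows immediately. A sketch with illustrative curve values:

```python
def swap_price(p, L, R, gamma):
    # P^Sw(t; T_0, T_n, R) = gamma * sum_k p(t,T_k) (L(t; T_{k-1}, T_k) - R), cf. (51), N = 1
    return gamma * sum(pk * (Lk - R) for pk, Lk in zip(p, L))

def par_swap_rate(p, L):
    # fixed rate that makes the payer swap worth zero at t
    return sum(pk * Lk for pk, Lk in zip(p, L)) / sum(p)

p = [0.99, 0.975, 0.96, 0.945]    # illustrative OIS discount factors p(t, T_k)
L = [0.021, 0.023, 0.024, 0.026]  # illustrative forward Libor rates L(t; T_{k-1}, T_k)
R_par = par_swap_rate(p, L)
```

By construction the swap value vanishes at the par rate and is positive for any lower fixed rate.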

We shall next obtain an explicit expression for $P^{Sw}(t; T_0, T_n, R)$ starting from the first equality in (51). To this effect, recalling from (24) that $\mathcal{C}^{22}(t,T) = \bar{\mathcal{C}}^{22}(t,T)$, we introduce again some shorthand notation, namely

$$\begin{aligned} A\_k &:= \bar{A}(T\_{k-1}, T\_k), B\_k^1 := B^1(T\_{k-1}, T\_k), \\ C\_k^{22} &:= C^{22}(T\_{k-1}, T\_k) = \bar{C}^{22}(T\_{k-1}, T\_k), \ \bar{C}\_k^{33} := \bar{C}^{33}(T\_{k-1}, T\_k). \end{aligned} \tag{52}$$

The crucial quantity to be computed in (51) is the following one

$$\begin{split} &E^{T_k}\{\gamma L(T_{k-1}; T_{k-1}, T_k)\,|\,\mathcal{F}_t\} = E^{T_k}\left\{\frac{1}{\bar{p}(T_{k-1}, T_k)}\,\Big|\,\mathcal{F}_t\right\} - 1 \\ &= e^{A_k}E^{T_k}\big\{\exp\big((\kappa+1)\mathcal{B}_k^1\Psi_{T_{k-1}}^1 + \mathcal{C}_k^{22}(\Psi_{T_{k-1}}^2)^2 + \bar{\mathcal{C}}_k^{33}(\Psi_{T_{k-1}}^3)^2\big)\,\big|\,\mathcal{F}_t\big\} - 1, \end{split} \tag{53}$$

where we have used the first relation on the right in (30). The expectations in (53) have to be computed under the measures *QTk* , under which, by analogy to (33), the factors have the dynamics

$$\begin{cases} d\boldsymbol{\Psi}\_{t}^{1} = -\left[b^{1}\boldsymbol{\Psi}\_{t}^{1} + (\sigma^{1})^{2}\boldsymbol{\mathcal{B}}^{1}(t, T\_{k})\right]dt + \sigma^{1}dw\_{t}^{1,k} \\ d\boldsymbol{\Psi}\_{t}^{2} = -\left[b^{2}\boldsymbol{\Psi}\_{t}^{2} + 2(\sigma^{2})^{2}C^{22}(t, T\_{k})\boldsymbol{\Psi}\_{t}^{2}\right]dt + \sigma^{2}dw\_{t}^{2,k} \\ d\boldsymbol{\Psi}\_{t}^{3} = -b^{3}\boldsymbol{\Psi}\_{t}^{3}dt + \sigma^{3}dw\_{t}^{3,k}.\end{cases} \tag{54}$$

where $w_t^{i,k}$, $i = 1, 2, 3$, are independent Wiener processes with respect to $Q^{T_k}$. A straightforward generalization of (46) to the case where the factor process $\Psi_t^1$ satisfies the following affine Hull–White model

$$d\Psi\_t^1 = (a^1(t) - b^1 \Psi\_t^1)dt + \sigma^1 dW\_t$$

can be obtained as follows

$$E^{Q}\left\{\exp\left[-\int_{t}^{T}\delta\,\Psi_{u}^{1}\,du - K\Psi_{T}^{1}\right]\,\Big|\,\mathcal{F}_{t}\right\} = \exp[\alpha^{1}(t,T) - \beta^{1}(t,T)\Psi_{t}^{1}], \tag{55}$$

with

$$\begin{cases} \beta^1(t, T) = Ke^{-b^1(T-t)} - \frac{\delta}{b^1} \left( e^{-b^1(T-t)} - 1 \right) \\ \alpha^1(t, T) = \frac{(\sigma^1)^2}{2} \int\_t^T (\beta^1(u, T))^2 du - \int\_t^T a^1(u)\beta^1(u, T) du. \end{cases} \tag{56}$$

We apply this result to our situation where, under $Q^{T_k}$, the process $\Psi_t^1$ satisfies the first SDE in (54) and thus corresponds to the above dynamics with $a^1(t) = -(\sigma^1)^2\mathcal{B}^1(t, T_k)$. Furthermore, setting $K = -(\kappa+1)\mathcal{B}_k^1$ and $\delta = 0$, we obtain for the first expectation in the second line of (53)

$$E^{T_k}\{\exp((\kappa+1)\mathcal{B}_k^1\Psi_{T_{k-1}}^1)\,|\,\mathcal{F}_t\} = \exp[\Gamma^1(t, T_k) - \rho^1(t, T_k)\,\Psi_t^1], \tag{57}$$

with

$$\begin{cases} \rho^1(t, T_k) = -(\kappa+1)\mathcal{B}_k^1\,e^{-b^1(T_k-t)} \\ \Gamma^1(t, T_k) = \frac{(\sigma^1)^2}{2}\int_{t}^{T_k}\left(\rho^1(u, T_k)\right)^2 du + (\sigma^1)^2\int_{t}^{T_k}\mathcal{B}^1(u, T_k)\,\rho^1(u, T_k)\,du. \end{cases} \tag{58}$$
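The functions in (58) are directly computable: $\rho^1$ is in closed form, while $\Gamma^1$ involves two integrals of known integrands, here combined under a single Simpson rule. In the sketch below $\mathcal{B}_k^1$ is treated as a given constant and all parameter values are illustrative:

```python
import math

def B1(t, T, b1):
    # B^1(t, T), cf. (27)
    return (1.0 - math.exp(-b1 * (T - t))) / b1

def rho1(t, Tk, Bk1, b1, kappa):
    # rho^1(t, T_k) from (58)
    return -(kappa + 1) * Bk1 * math.exp(-b1 * (Tk - t))

def gamma1(t, Tk, Bk1, b1, sig1, kappa, n=2000):
    # Gamma^1(t, T_k) from (58), both integrals evaluated by Simpson's rule
    h = (Tk - t) / n
    f = lambda u: (0.5 * rho1(u, Tk, Bk1, b1, kappa) ** 2
                   + B1(u, Tk, b1) * rho1(u, Tk, Bk1, b1, kappa))
    s = f(t) + f(Tk)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(t + k * h)
    return sig1 ** 2 * s * h / 3.0
```

At $t = T_k$ both integrals are empty, so $\Gamma^1(T_k, T_k) = 0$ and $\rho^1(T_k, T_k) = -(\kappa+1)\mathcal{B}_k^1$, consistent with the terminal condition.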

For the remaining two expectations in the second line of (53) we shall use the following:

**Lemma 4.1** *Let a generic process $\Psi_t$ satisfy the dynamics*

$$d\Psi\_t = b(t)\Psi\_t dt + \sigma \, dw\_t \tag{59}$$

*with $w_t$ a Wiener process. Then, for all $C \in \mathbb{R}$ such that $E^{Q}\left\{\exp\left[C(\Psi_T)^2\right]\right\} < \infty$, we have*

$$E^{Q}\left\{\exp\left[C(\Psi_T)^2\right] \mid \mathcal{F}_t\right\} = \exp\left[\Gamma(t,T) - \rho(t,T)(\Psi_t)^2\right] \tag{60}$$

*with* ρ(*t*, *T*) *and* Γ (*t*, *T*) *satisfying*

$$\begin{cases} \rho_t(t,T) + 2b(t)\rho(t,T) - 2(\sigma)^2\left(\rho(t,T)\right)^2 = 0\,; \quad \rho(T,T) = -C \\ \Gamma_t(t,T) = (\sigma)^2\rho(t,T). \end{cases} \tag{61}$$

*Proof* An application of Itô's formula yields that the nonnegative process $\Phi_t := (\Psi_t)^2$ satisfies the following SDE

$$d\Phi\_t = \left(\left(\sigma\right)^2 + 2b(t)\,\Phi\_t\right)dt + 2\sigma\sqrt{\Phi\_t}\,dw\_t.$$

We recall that a process Φ*<sup>t</sup>* given in general form by

$$d\Phi\_t = (a + \lambda(t)\Phi\_t)dt + \eta \sqrt{\Phi\_t} \, d\boldsymbol{w}\_t,$$

with *a*, η > 0 and λ(*t*) a deterministic function, is a CIR process. Thus, (Ψ*t*)<sup>2</sup> is equivalent in distribution to a CIR process with coefficients given by

$$
\lambda(t) = 2b(t) \quad , \quad \eta = 2\sigma \quad , \quad a = \left(\sigma\right)^2.
$$

From the theory of affine term structure models (see e.g. Lamberton and Lapeyre [23], or Lemma 2.2 in Grbac and Runggaldier [18]) it now follows that

$$\begin{aligned} E^{Q}\left\{\exp\left[C(\Psi_T)^2\right] \mid \mathcal{F}_t\right\} &= E^{Q}\left\{\exp\left[C\,\Phi_T\right] \mid \mathcal{F}_t\right\} = \exp\left[\Gamma(t,T) - \rho(t,T)\,\Phi_t\right] \\ &= \exp\left[\Gamma(t,T) - \rho(t,T)(\Psi_t)^2\right] \end{aligned}$$

with $\rho(t,T)$ and $\Gamma(t,T)$ satisfying (61). $\square$

**Corollary 4.1** *When $b(t)$ is constant with respect to time, i.e. $b(t) \equiv b$, so that also $\lambda(t) \equiv \lambda$, the equations for $\rho(t,T)$ and $\Gamma(t,T)$ in (61) admit an explicit solution given by*

$$\begin{cases} \rho(t,T) = \frac{4bh\,e^{2b(T-t)}}{4(\sigma)^2h\,e^{2b(T-t)} - 1} \quad \text{with} \quad h := \frac{C}{4(\sigma)^2C + 4b} \\ \Gamma(t,T) = -(\sigma)^2\int_t^T\rho(u,T)\,du. \end{cases} \tag{62}$$
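The closed form (62) can be verified directly: it must satisfy the terminal condition $\rho(T,T) = -C$ and the Riccati equation from (61). A sketch with illustrative parameter values (chosen so that the denominators stay away from zero):

```python
import math

def rho(t, T, b, sig, C):
    # closed-form solution (62) of the Riccati equation in (61) for constant b(t) = b
    h = C / (4 * sig ** 2 * C + 4 * b)
    e = math.exp(2 * b * (T - t))
    return 4 * b * h * e / (4 * sig ** 2 * h * e - 1)

b, sig, C, T = 0.5, 0.2, -0.3, 1.0

def riccati_residual(t, eps=1e-6):
    # residual of rho_t + 2 b rho - 2 sig^2 rho^2 = 0, time derivative by central difference
    d = (rho(t + eps, T, b, sig, C) - rho(t - eps, T, b, sig, C)) / (2 * eps)
    r = rho(t, T, b, sig, C)
    return d + 2 * b * r - 2 * sig ** 2 * r * r
```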

Coming now to the second expectation in the second line of (53) and using the second equation in (54), we set

$$b(t) := -\left[b^2 + 2(\sigma^2)^2 C^{22}(t, T\_k)\right], \; \sigma := \sigma^2, \; C = C\_k^{22}$$

and apply Lemma 4.1, provided the parameters $b^2$ and $\sigma^2$ of the process $\Psi^2$ are such that $C = \mathcal{C}_k^{22}$ satisfies the assumption of the lemma. We thus obtain

$$E^{T_k}\{\exp(\mathcal{C}_k^{22}(\Psi_{T_{k-1}}^2)^2)\,|\,\mathcal{F}_t\} = \exp[\Gamma^2(t, T_k) - \rho^2(t, T_k)(\Psi_t^2)^2], \tag{63}$$

with ρ<sup>2</sup>(*t*, *T*), Γ <sup>2</sup>(*t*, *T*) satisfying

$$\begin{cases} \rho_t^2(t,T) - 2\left[b^2 + 2(\sigma^2)^2\mathcal{C}^{22}(t, T_k)\right]\rho^2(t,T) - 2(\sigma^2)^2(\rho^2(t,T))^2 = 0 \\ \rho^2(T_k, T_k) = -\mathcal{C}_k^{22} \\ \Gamma^2(t,T) = -(\sigma^2)^2\int_t^T\rho^2(u,T)\,du. \end{cases} \tag{64}$$

Finally, for the third expectation in the second line of (53), we may take advantage of the fact that the dynamics of $\Psi_t^3$ do not change when passing from the measure $Q$ to the forward measure $Q^{T_k}$. We can then apply Lemma 4.1, this time with (see the third equation in (54))

$$b(t) := -b^{3}, \; \sigma := \sigma^{3}, \; C = \bar{\mathcal{C}}_k^{33},$$

ensuring that the parameters $b^3$ and $\sigma^3$ of the process $\Psi^3$ are such that $C = \bar{\mathcal{C}}_k^{33}$ satisfies the assumption of the lemma. Since $b(t)$ is constant with respect to time, Corollary 4.1 also applies and we obtain

$$E^{T_k}\{\exp(\bar{\mathcal{C}}_k^{33}(\Psi_{T_{k-1}}^3)^2)\,|\,\mathcal{F}_t\} = \exp[\Gamma^3(t, T_k) - \rho^3(t, T_k)(\Psi_t^3)^2],$$


where

$$\begin{cases} \rho^3(t, T_k) = \frac{-4b^3h_k^3\,e^{-2b^3(T_k-t)}}{4(\sigma^3)^2h_k^3\,e^{-2b^3(T_k-t)} - 1} \quad \text{with} \quad h_k^3 = \frac{\bar{\mathcal{C}}_k^{33}}{4(\sigma^3)^2\bar{\mathcal{C}}_k^{33} - 4b^3} \\ \Gamma^3(t, T_k) = -(\sigma^3)^2\int_t^{T_k}\rho^3(u, T_k)\,du. \end{cases} \tag{65}$$

With the use of the explicit expressions for the expectations in (53), and taking also into account the expression for *p*(*t*, *T*) in (29), it follows immediately that the arbitrage-free swap price in (51) can be expressed according to the following

**Proposition 4.2** *The price of a payer interest rate swap at t* ≤ *T*<sup>0</sup> *is given by*

$$\begin{split} P^{Sw}(t; T_0, T_n, R) &= \gamma\sum_{k=1}^n p(t, T_k)\,E^{T_k}\left\{L(T_{k-1}; T_{k-1}, T_k) - R\,|\,\mathcal{F}_t\right\} \\ &= \sum_{k=1}^n p(t, T_k)\left(D_{t,k}\,e^{-\rho^1(t,T_k)\Psi_t^1 - \rho^2(t,T_k)(\Psi_t^2)^2 - \rho^3(t,T_k)(\Psi_t^3)^2} - (R\gamma + 1)\right) \\ &= \sum_{k=1}^n\Big(D_{t,k}\,e^{-A_{t,k}}\,e^{-\tilde{B}_{t,k}^1\Psi_t^1 - \tilde{C}_{t,k}^{22}(\Psi_t^2)^2 - \tilde{C}_{t,k}^{33}(\Psi_t^3)^2} \\ &\qquad - (R\gamma + 1)\,e^{-A_{t,k}}\,e^{-B_{t,k}^1\Psi_t^1 - C_{t,k}^{22}(\Psi_t^2)^2}\Big), \end{split} \tag{66}$$

*where*

$$\begin{array}{lll} A_{t,k} := A(t, T_k), & B_{t,k}^1 := \mathcal{B}^1(t, T_k), & C_{t,k}^{22} := \mathcal{C}^{22}(t, T_k), \\ \tilde{B}_{t,k}^1 := B_{t,k}^1 + \rho^1(t, T_k), & \tilde{C}_{t,k}^{22} := C_{t,k}^{22} + \rho^2(t, T_k), & \tilde{C}_{t,k}^{33} := \rho^3(t, T_k), \\ D_{t,k} := e^{A_k}\exp[\Gamma^1(t, T_k) + \Gamma^2(t, T_k) + \Gamma^3(t, T_k)], & & \end{array} \tag{67}$$

*with $\rho^i(t, T_k)$, $\Gamma^i(t, T_k)$ $(i = 1, 2, 3)$ determined according to (58), (64) and (65) respectively, and with $A_k$ as in (52).*

# **5 Nonlinear/optional Interest Rate Derivatives**

In this section we consider the main nonlinear interest rate derivatives with the Libor rate as underlying. They are also called optional derivatives since they have the form of an option. In Sect. 5.1 we shall consider the case of caps and, symmetrically, that of floors. In the subsequent Sect. 5.2 we shall then concentrate on swaptions as options on a payer swap of the type discussed in Sect. 4.2.

# *5.1 Caps and Floors*

Since floors can be treated in a completely symmetric way to caps, simply by interchanging the roles of the fixed rate and the Libor rate, we shall concentrate here on caps. Furthermore, to keep the presentation simple, we consider just a single caplet for the time interval $[T, T+\Delta]$ and for a fixed rate $R$ (recall also that we consider just one tenor $\Delta$). The payoff of the caplet at time $T+\Delta$ is thus $\Delta(L(T; T, T+\Delta) - R)^+$, assuming the notional $N = 1$, and its time-$t$ price $P^{Cpl}(t; T+\Delta, R)$ is given by the following risk-neutral pricing formula under the forward measure $Q^{T+\Delta}$

$$P^{Cpl}(t; T+\Delta, R) = \Delta\,p(t, T+\Delta)\,E^{T+\Delta}\left\{(L(T; T, T+\Delta) - R)^{+} \mid \mathcal{F}_t\right\}.$$

In view of deriving pricing formulas, recall from Sect. 3.3 that, under the $(T+\Delta)$-forward measure, at time $T$ the factors $\Psi_T^i$ have independent Gaussian distributions (see (34)) with mean and variance given, for $i = 1, 2, 3$, by

$$E^{T+\Delta} \{ \Psi\_t^i \} = \bar{\alpha}\_t^i = \bar{\alpha}\_t^i(b^i, \sigma^i), \qquad Var^{T+\Delta} \{ \Psi\_t^i \} = \bar{\beta}\_t^i = \bar{\beta}\_t^i(b^i, \sigma^i).$$

In the formulas below we shall consider the joint probability density function of $(\Psi_T^1, \Psi_T^2, \Psi_T^3)$ under the $(T+\Delta)$-forward measure, namely, using the independence of the processes $\Psi_t^i$ $(i = 1, 2, 3)$,

$$f_{(\Psi_T^1,\Psi_T^2,\Psi_T^3)}(x_1, x_2, x_3) = \prod_{i=1}^{3} f_{\Psi_T^i}(x_i) = \prod_{i=1}^{3}\mathcal{N}(x_i; \bar{\alpha}_T^i, \bar{\beta}_T^i), \tag{68}$$

and we use the shorthand notation $f_i(\cdot)$ for $f_{\Psi_T^i}(\cdot)$ in the sequel. We shall also write $\bar{A}$, $\mathcal{B}^1$, $\mathcal{C}^{22}$, $\bar{\mathcal{C}}^{33}$ for the corresponding functions evaluated at $(T, T+\Delta)$ and given in (28), (27) and (24) respectively.

Setting $\tilde{R} := 1 + \Delta R$, and recalling the first equality in (30), the time-0 price of the caplet can be expressed as

$$\begin{split} P^{Cpl}(0; T+\Delta, R) &= \Delta\,p(0, T+\Delta)\,E^{T+\Delta}\left\{(L(T; T, T+\Delta) - R)^{+}\right\} \\ &= p(0, T+\Delta)\,E^{T+\Delta}\left\{\Big(\frac{1}{\bar{p}(T, T+\Delta)} - \tilde{R}\Big)^{+}\right\} \\ &= p(0, T+\Delta)\,E^{T+\Delta}\left\{\Big(e^{\bar{A} + (\kappa+1)\mathcal{B}^1\Psi_T^1 + \mathcal{C}^{22}(\Psi_T^2)^2 + \bar{\mathcal{C}}^{33}(\Psi_T^3)^2} - \tilde{R}\Big)^{+}\right\} \\ &= p(0, T+\Delta)\int_{\mathbb{R}^3}\Big(e^{\bar{A} + (\kappa+1)\mathcal{B}^1 x + \mathcal{C}^{22}y^2 + \bar{\mathcal{C}}^{33}z^2} - \tilde{R}\Big)^{+} \\ &\qquad\cdot f_{(\Psi_T^1,\Psi_T^2,\Psi_T^3)}(x, y, z)\,d(x, y, z). \end{split} \tag{69}$$
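The triple integral in (69) is over a product of one-dimensional Gaussian densities and can be approximated by brute-force quadrature. The sketch below uses a midpoint rule on a truncated box; all model inputs ($\bar{A}$, $\mathcal{B}^1$, $\mathcal{C}^{22}$, $\bar{\mathcal{C}}^{33}$, $\kappa$ and the moments $\bar{\alpha}_T^i$, $\bar{\beta}_T^i$) are illustrative placeholders rather than outputs of the model formulas:

```python
import math

def normal_pdf(x, m, v):
    # density of N(m, v)
    return math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)

def caplet_price(p0, R, delta, Abar, B1, C22, C33bar, kappa, means, variances,
                 n=40, width=6.0):
    # time-0 caplet price via the triple integral in (69),
    # midpoint rule on a box of +/- width standard deviations per factor
    Rtilde = 1.0 + delta * R
    axes = []
    for m, v in zip(means, variances):
        s = math.sqrt(v)
        lo, h = m - width * s, 2.0 * width * s / n
        axes.append([(lo + (i + 0.5) * h, h) for i in range(n)])
    total = 0.0
    for x, hx in axes[0]:
        for y, hy in axes[1]:
            for z, hz in axes[2]:
                g = math.exp(Abar + (kappa + 1.0) * B1 * x + C22 * y * y + C33bar * z * z)
                payoff = g - Rtilde
                if payoff > 0.0:
                    w = (normal_pdf(x, means[0], variances[0])
                         * normal_pdf(y, means[1], variances[1])
                         * normal_pdf(z, means[2], variances[2]))
                    total += payoff * w * hx * hy * hz
    return p0 * total
```

The quadratic coefficients must be small relative to the inverse variances for the truncated box to capture the integral; the price is nonnegative and decreasing in the strike $R$.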

To proceed, we extend to the multi-curve context an idea suggested in Jamshidian [19] (where it is applied to the pricing of coupon bonds) by considering the function

$$g(x, y, z) := \exp\big[\bar A + (\kappa + 1)\bar B^1 x + \bar C^{22} y^2 + \bar C^{33} z^2\big].\tag{70}$$

Noticing that $\bar C^{33}(T, T+\Delta) > 0$ (see (24), together with the fact that $h^3 > 0$ and $2b^3 + h^3 > 0$), for fixed $x, y$ the function $g(x, y, z)$ can be seen to be continuous and increasing for $z \ge 0$ and decreasing for $z < 0$, with $\lim_{z\to\pm\infty} g(x, y, z) = +\infty$. It will now be convenient to introduce some objects according to the following:

**Definition 5.1** Let the set $M \subset \mathbb{R}^2$ be given by

$$M := \{(x, y) \in \mathbb{R}^2 \mid g(x, y, 0) \le \bar R\}\tag{71}$$

and let $M^c$ be its complement. Furthermore, for $(x, y) \in M$ let

$$\bar z^1 = \bar z^1(x, y), \quad \bar z^2 = \bar z^2(x, y)$$

be the solutions of $g(x, y, z) = \bar R$. They satisfy $\bar z^1 \le 0 \le \bar z^2$.

Notice that, for $z \le \bar z^1 \le 0$ and $z \ge \bar z^2 \ge 0$, we have $g(x, y, z) \ge g(x, y, \bar z^k) = \bar R$, and for $z \in (\bar z^1, \bar z^2)$ we have $g(x, y, z) < \bar R$. In $M^c$ we have $g(x, y, z) \ge g(x, y, 0) > \bar R$, and thus the equation $g(x, y, z) = \bar R$ has no solution.
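Since $g$ in (70) is a single exponential in $z^2$, the roots $\bar z^1, \bar z^2$ of $g(x, y, z) = \bar R$ are available in closed form, and membership of $(x, y)$ in $M$ reduces to a sign check. A small sketch; `caplet_zbars` and all parameter values are hypothetical, assuming $\bar C^{33} > 0$ as above:

```python
import math

def caplet_zbars(x, y, Abar, Bbar1, C22, C33, kappa, Rbar):
    """Roots zbar1 <= 0 <= zbar2 of g(x, y, z) = Rbar, where
    g(x, y, z) = exp(Abar + (kappa+1)*Bbar1*x + C22*y**2 + C33*z**2)
    as in (70), with C33 > 0.  Returns None when (x, y) lies in M^c,
    i.e. when g(x, y, 0) > Rbar and the equation has no real root."""
    rhs = (math.log(Rbar) - Abar - (kappa + 1.0) * Bbar1 * x - C22 * y * y) / C33
    if rhs < 0.0:          # (x, y) in M^c
        return None
    root = math.sqrt(rhs)
    return -root, root
```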

In view of the main result of this subsection, given in Proposition 5.1 below, we first prove the following:

**Lemma 5.1** *Assuming that the (nonnegative) coefficients $b^3$, $\sigma^3$ in the dynamics (10) of the factor $\Psi^3_t$ satisfy the condition*

$$b^3 \ge \frac{\sigma^3}{\sqrt{2}},\tag{72}$$

*we have that* $1 - 2\bar\beta^3_T \bar C^{33} > 0$*, where* $\bar C^{33} = \bar C^{33}(T, T+\Delta)$ *is given by (24) and where* $\bar\beta^3_T = \frac{(\sigma^3)^2}{2b^3}\big(1 - e^{-2b^3 T}\big)$ *according to (34).*

*Proof* From the definitions of $\bar\beta^3_T$ and $\bar C^{33}$ we may write

$$1 - 2\bar{\beta}\_T^3 \bar{C}^{33} = 1 - \left(1 - e^{-2b^3 T}\right) \frac{2\left(e^{\Delta h^3} - 1\right)}{2\frac{b^3 h^3}{(\sigma^3)^2} + \frac{b^3}{(\sigma^3)^2} (2b^3 + h^3) \left(e^{\Delta h^3} - 1\right)}.\tag{73}$$

Notice next that $b^3 > 0$ implies that $1 - e^{-2b^3 T} \in (0, 1)$ and that $\frac{b^3 h^3}{(\sigma^3)^2} \ge 0$. From (73) it then follows that a sufficient condition for $1 - 2\bar\beta^3_T \bar C^{33} > 0$ to hold is that

$$2 \le \frac{b^3}{(\sigma^3)^2} \left(2b^3 + h^3\right). \tag{74}$$

Given that (see the definition after (24)) $h^3 = 2\sqrt{(b^3)^2 + 2(\sigma^3)^2} \ge 2b^3$, the condition (74) is satisfied under our assumption (72). $\square$

**Proposition 5.1** *Under assumption (72) we have that the time-*0 *price of the caplet for the time interval* $[T, T+\Delta]$ *and with fixed rate R is given by*

$$\begin{aligned}
P^{Cpl}(0;T+\Delta,R) &= p(0,T+\Delta)\Big[\int_M e^{\bar A+(\kappa+1)\bar B^1x+\bar C^{22}y^2}\Big[\gamma(\bar\alpha^3_T,\bar\beta^3_T,\bar C^{33})\big(\Phi(d^1(x,y))+\Phi(-d^2(x,y))\big) \\
&\qquad - e^{\bar C^{33}(\bar z^1(x,y))^2}\,\Phi(d^3(x,y)) - e^{\bar C^{33}(\bar z^2(x,y))^2}\,\Phi(-d^4(x,y))\Big] f_1(x)f_2(y)\,dx\,dy \\
&\quad + \gamma(\bar\alpha^3_T,\bar\beta^3_T,\bar C^{33})\int_{M^c} e^{\bar A+(\kappa+1)\bar B^1x+\bar C^{22}y^2} f_1(x)f_2(y)\,dx\,dy \\
&\quad - \bar R\,Q^{T+\Delta}\big[(\Psi^1_T,\Psi^2_T)\in M^c\big]\Big],
\end{aligned}\tag{75}$$

*where* $\Phi(\cdot)$ *is the cumulative standard Gaussian distribution function, $M$ and $M^c$ are as in Definition 5.1,*

$$\begin{cases}
d^1(x,y) := \dfrac{\sqrt{1-2\bar\beta^3_T\bar C^{33}}\,\bar z^1(x,y) - (\bar\alpha^3_T - \theta\bar\beta^3_T)}{\sqrt{\bar\beta^3_T}} \\[2mm]
d^2(x,y) := \dfrac{\sqrt{1-2\bar\beta^3_T\bar C^{33}}\,\bar z^2(x,y) - (\bar\alpha^3_T - \theta\bar\beta^3_T)}{\sqrt{\bar\beta^3_T}} \\[2mm]
d^3(x,y) := \dfrac{\bar z^1(x,y) - \bar\alpha^3_T}{\sqrt{\bar\beta^3_T}} \\[2mm]
d^4(x,y) := \dfrac{\bar z^2(x,y) - \bar\alpha^3_T}{\sqrt{\bar\beta^3_T}}
\end{cases}\tag{76}$$

*with* $\theta := \bar\alpha^3_T\,\frac{1 - 1/\sqrt{1 - 2\bar\beta^3_T\bar C^{33}}}{\bar\beta^3_T}$*, which by Lemma 5.1 is well defined under the given assumption (72), and with* $\gamma(\bar\alpha^3_T, \bar\beta^3_T, \bar C^{33}) := \frac{e^{\frac12\theta^2\bar\beta^3_T - \bar\alpha^3_T\theta}}{\sqrt{1 - 2\bar\beta^3_T\bar C^{33}}}$.

*Remark 5.1* Notice that, once the set $M$ and its complement $M^c$ from Definition 5.1 are made explicit, the integrals, as well as the probability in (75), can be computed explicitly.
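As an illustration of Remark 5.1, the probability $Q^{T+\Delta}[(\Psi^1_T, \Psi^2_T)\in M^c]$ can be evaluated by two-dimensional quadrature once the indicator of $M^c$ is written out from (70)–(71). A sketch under illustrative parameters; `prob_Mc` and its arguments are hypothetical names, not from the paper:

```python
import numpy as np
from scipy import integrate
from scipy.stats import norm

def prob_Mc(Abar, Bbar1, C22, kappa, Rbar, a1, b1, a2, b2):
    """Q^{T+Delta}[(Psi^1_T, Psi^2_T) in M^c] by 2-D quadrature, where
    M^c = {(x, y): Abar + (kappa+1)*Bbar1*x + C22*y**2 > log(Rbar)}
    and (a1, b1), (a2, b2) are the means/variances of the Gaussian
    densities f_1, f_2 from (68).  Illustrative sketch only."""
    thr = np.log(Rbar)

    def integrand(y, x):      # dblquad integrates func(y, x)
        in_Mc = Abar + (kappa + 1.0) * Bbar1 * x + C22 * y * y > thr
        return norm.pdf(x, a1, np.sqrt(b1)) * norm.pdf(y, a2, np.sqrt(b2)) * in_Mc

    val, _ = integrate.dblquad(integrand, -10.0, 10.0,
                               lambda x: -10.0, lambda x: 10.0)
    return val
```

When $\bar C^{22} = 0$ the region $M^c$ is a half-plane in $x$ and the probability reduces to $1 - \Phi(\cdot)$, which gives an exact benchmark for the quadrature.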

*Proof* On the basis of the sets $M$ and $M^c$ we can continue (69) as

$$\begin{aligned}
P^{Cpl}(0;T+\Delta,R) &= p(0,T+\Delta)\int_{\mathbb{R}^3}\Big(e^{\bar A+(\kappa+1)\bar B^1x+\bar C^{22}y^2+\bar C^{33}z^2}-\bar R\Big)^{\!+} f_{(\Psi^1_T,\Psi^2_T,\Psi^3_T)}(x,y,z)\,d(x,y,z) \\
&= p(0,T+\Delta)\int_{M\times\mathbb{R}}\Big(e^{\bar A+(\kappa+1)\bar B^1x+\bar C^{22}y^2+\bar C^{33}z^2}-\bar R\Big)^{\!+} f_{(\Psi^1_T,\Psi^2_T,\Psi^3_T)}(x,y,z)\,d(x,y,z) \\
&\quad + p(0,T+\Delta)\int_{M^c\times\mathbb{R}}\Big(e^{\bar A+(\kappa+1)\bar B^1x+\bar C^{22}y^2+\bar C^{33}z^2}-\bar R\Big)^{\!+} f_{(\Psi^1_T,\Psi^2_T,\Psi^3_T)}(x,y,z)\,d(x,y,z) \\
&=: P^1(0;T+\Delta)+P^2(0;T+\Delta).
\end{aligned}\tag{77}$$

We shall next compute separately the two terms in the last equality in (77), distinguishing between two cases according to whether $(x, y) \in M$ or $(x, y) \in M^c$.

Case (i): For $(x, y) \in M$ we have from Definition 5.1 that there exist $\bar z^1(x, y) \le 0$ and $\bar z^2(x, y) \ge 0$ so that for $z \in [\bar z^1, \bar z^2]$ we have $g(x, y, z) \le g(x, y, \bar z^k) = \bar R$. For $P^1(0; T+\Delta)$ we now obtain

$$\begin{aligned}
P^1(0;T+\Delta) &= p(0,T+\Delta)\int_M e^{\bar A+(\kappa+1)\bar B^1x+\bar C^{22}y^2}\Big(\int_{-\infty}^{\bar z^1(x,y)}\big(e^{\bar C^{33}z^2}-e^{\bar C^{33}(\bar z^1)^2}\big)f_3(z)\,dz \\
&\qquad + \int_{\bar z^2(x,y)}^{+\infty}\big(e^{\bar C^{33}z^2}-e^{\bar C^{33}(\bar z^2)^2}\big)f_3(z)\,dz\Big)f_2(y)f_1(x)\,dy\,dx.
\end{aligned}\tag{78}$$

Next, using the results of Sect. 3.3 concerning the Gaussian distribution $f_3(\cdot) = f_{\Psi^3_T}(\cdot)$, we obtain the calculations in (79) below, where, recalling Lemma 5.1, we make successively the following changes of variables: $\zeta := \sqrt{1 - 2\bar\beta^3_T\bar C^{33}}\,z$, $\theta := \bar\alpha^3_T\,\frac{1 - 1/\sqrt{1 - 2\bar\beta^3_T\bar C^{33}}}{\bar\beta^3_T}$, $s := \frac{\zeta - (\bar\alpha^3_T - \theta\bar\beta^3_T)}{\sqrt{\bar\beta^3_T}}$, and where $d^i(x, y)$, $i = 1, \dots, 4$, are as defined in (76):

$$\begin{aligned}
\int_{-\infty}^{\bar z^1(x,y)} e^{\bar C^{33}z^2} f_3(z)\,dz 
&= \int_{-\infty}^{\bar z^1(x,y)} e^{\bar C^{33}z^2}\,\frac{1}{\sqrt{2\pi\bar\beta^3_T}}\,e^{-\frac12\frac{(z-\bar\alpha^3_T)^2}{\bar\beta^3_T}}\,dz \\
&= \int_{-\infty}^{\bar z^1(x,y)} \frac{1}{\sqrt{2\pi\bar\beta^3_T}}\,e^{-\frac12\frac{(\sqrt{1-2\bar\beta^3_T\bar C^{33}}\,z-\bar\alpha^3_T)^2}{\bar\beta^3_T}}\,e^{-\frac{\bar\alpha^3_T(\sqrt{1-2\bar\beta^3_T\bar C^{33}}-1)}{\bar\beta^3_T}z}\,dz \\
&= \int_{-\infty}^{\sqrt{1-2\bar\beta^3_T\bar C^{33}}\,\bar z^1(x,y)} \frac{1}{\sqrt{2\pi\bar\beta^3_T}}\,e^{-\frac12\frac{(\zeta-\bar\alpha^3_T)^2}{\bar\beta^3_T}}\,e^{-\frac{\bar\alpha^3_T(1-1/\sqrt{1-2\bar\beta^3_T\bar C^{33}})}{\bar\beta^3_T}\zeta}\,\frac{d\zeta}{\sqrt{1-2\bar\beta^3_T\bar C^{33}}} \\
&= \frac{1}{\sqrt{1-2\bar\beta^3_T\bar C^{33}}}\int_{-\infty}^{\sqrt{1-2\bar\beta^3_T\bar C^{33}}\,\bar z^1(x,y)} \frac{1}{\sqrt{2\pi\bar\beta^3_T}}\,e^{-\frac12\frac{(\zeta-\bar\alpha^3_T)^2}{\bar\beta^3_T}}\,e^{-\theta\zeta}\,d\zeta \\
&= \frac{e^{\frac12\theta^2\bar\beta^3_T-\bar\alpha^3_T\theta}}{\sqrt{1-2\bar\beta^3_T\bar C^{33}}}\int_{-\infty}^{d^1(x,y)}\frac{1}{\sqrt{2\pi}}\,e^{-\frac{s^2}{2}}\,ds
= \frac{e^{\frac12\theta^2\bar\beta^3_T-\bar\alpha^3_T\theta}}{\sqrt{1-2\bar\beta^3_T\bar C^{33}}}\,\Phi(d^1(x,y)).
\end{aligned}\tag{79}$$

On the other hand, always using the results of Sect. 3.3 concerning the Gaussian distribution $f_3(\cdot) = f_{\Psi^3_T}(\cdot)$ and making this time the change of variable $\zeta := \frac{z - \bar\alpha^3_T}{\sqrt{\bar\beta^3_T}}$, we obtain

$$\int_{-\infty}^{\bar z^1(x,y)} e^{\bar C^{33}(\bar z^1)^2} f_3(z)\,dz = e^{\bar C^{33}(\bar z^1)^2}\int_{-\infty}^{\bar z^1(x,y)}\frac{1}{\sqrt{2\pi\bar\beta^3_T}}\,e^{-\frac12\frac{(z-\bar\alpha^3_T)^2}{\bar\beta^3_T}}\,dz = e^{\bar C^{33}(\bar z^1)^2}\int_{-\infty}^{d^3(x,y)}\frac{1}{\sqrt{2\pi}}\,e^{-\frac12\zeta^2}\,d\zeta = e^{\bar C^{33}(\bar z^1)^2}\,\Phi(d^3(x,y)).\tag{80}$$

Similarly, we have

$$\begin{aligned}
\int_{\bar z^2(x,y)}^{+\infty} e^{\bar C^{33}z^2} f_3(z)\,dz &= \frac{e^{\frac12\theta^2\bar\beta^3_T-\bar\alpha^3_T\theta}}{\sqrt{1-2\bar\beta^3_T\bar C^{33}}}\,\Phi(-d^2(x,y)), \\
\int_{\bar z^2(x,y)}^{+\infty} e^{\bar C^{33}(\bar z^2)^2} f_3(z)\,dz &= e^{\bar C^{33}(\bar z^2)^2}\,\Phi(-d^4(x,y)).
\end{aligned}\tag{81}$$

Case (ii): We come next to the case $(x, y) \in M^c$, for which $g(x, y, z) \ge g(x, y, 0) > \bar R$. For $P^2(0; T+\Delta)$ we obtain

$$\begin{aligned}
P^2(0;T+\Delta) &= p(0,T+\Delta)\int_{M^c\times\mathbb{R}}\Big(e^{\bar A+(\kappa+1)\bar B^1x+\bar C^{22}y^2+\bar C^{33}z^2}-\bar R\Big)f_3(z)f_2(y)f_1(x)\,dz\,dy\,dx \\
&= p(0,T+\Delta)\Big[e^{\bar A}\int_{M^c}e^{(\kappa+1)\bar B^1x+\bar C^{22}y^2}f_1(x)f_2(y)\,dx\,dy\int_{\mathbb{R}}e^{\bar C^{33}z^2}f_3(z)\,dz \\
&\hspace{6em} - \bar R\,Q^{T+\Delta}\big[(\Psi^1_T,\Psi^2_T)\in M^c\big]\Big] \\
&= p(0,T+\Delta)\Big[e^{\bar A}\int_{M^c}e^{(\kappa+1)\bar B^1x+\bar C^{22}y^2}f_1(x)f_2(y)\,dx\,dy\;\frac{e^{\frac12\theta^2\bar\beta^3_T-\bar\alpha^3_T\theta}}{\sqrt{1-2\bar\beta^3_T\bar C^{33}}} \\
&\hspace{6em} - \bar R\,Q^{T+\Delta}\big[(\Psi^1_T,\Psi^2_T)\in M^c\big]\Big],
\end{aligned}\tag{82}$$

where we computed the integral over R analogously to (79).

Adding the two expressions derived for Cases (i) and (ii), we obtain the statement of the proposition. $\square$

# *5.2 Swaptions*

We start by recalling some of the most relevant aspects of a (payer) swaption. Considering a swap (see Sect. 4.2) for a given collection of dates $0 \le T_0 < T_1 < \dots < T_n$, a swaption is an option to enter the swap at a pre-specified initiation date $T \le T_0$, which is thus also the maturity of the swaption; for simplicity of notation we assume it coincides with $T_0$, i.e. $T = T_0$. The arbitrage-free swaption price at $t \le T_0$ can be computed as

$$P^{Swn}(t; T_0, T_n, R) = p(t, T_0)\,E^{T_0}\Big\{\big(P^{Sw}(T_0; T_n, R)\big)^+ \,\Big|\, \mathcal{F}_t\Big\},\tag{83}$$

where we have used the shorthand notation $P^{Sw}(T_0; T_n, R) = P^{Sw}(T_0; T_0, T_n, R)$.

We first state the next lemma, which follows immediately from the expression for $\rho^3(t, T_k)$ and the corresponding expression for $h^3_k$ in (65).

**Lemma 5.2** *We have the equivalence*

$$\rho^3(t, T\_k) > 0 \Leftrightarrow h\_k^3 \in \left(0, \frac{1}{4(\sigma^3)^2 e^{-2b^3(T\_k - t)}}\right). \tag{84}$$

This lemma prompts us to split the swaption pricing problem into two cases:

$$\begin{array}{lcl}\textbf{Case}(\textbf{1}):&h\_k^3 < 0 \text{ or } h\_k^3 > \frac{1}{4(\sigma^3)^2 e^{-2b^3(T\_k - t)}}\\\textbf{Case}(\textbf{2}):&0 < h\_k^3 < \frac{1}{4(\sigma^3)^2 e^{-2b^3(T\_k - t)}}.\end{array} \tag{85}$$

Note from the definition of $\rho^3(t, T_k)$ that both $h^3_k = \frac{1}{4(\sigma^3)^2 e^{-2b^3(T_k - t)}}$ and $h^3_k = 0$ would imply $\tilde C^{33}_k = 0$, which corresponds to a trivial case in which the factor $\Psi^3$ is not present in the dynamics of the spread $s$; hence the inequalities in Case (1) and Case (2) above are indeed strict.
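The case selection in (85) is a simple bound check on $h^3_k$; a sketch in which the function and argument names are ours, with `tau` standing for $T_k - t$:

```python
import math

def swaption_case(h3_k, sigma3, b3, tau):
    """Classify h^3_k into Case (1) or Case (2) of (85), tau = T_k - t.
    Boundary values are excluded, since they correspond to the
    degenerate situation in which Psi^3 drops out of the spread."""
    bound = 1.0 / (4.0 * sigma3 ** 2 * math.exp(-2.0 * b3 * tau))
    if 0.0 < h3_k < bound:
        return 2          # rho^3(t, T_k) > 0
    if h3_k < 0.0 or h3_k > bound:
        return 1
    raise ValueError("h3_k on the boundary: factor Psi^3 degenerates")
```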

To proceed, we shall introduce some more notation. In particular, instead of only one function $g(x, y, z)$ as in (70), we shall also consider a function $h(x, y)$; more precisely, we define here the continuous functions

$$g(x, y, z) := \sum_{k=1}^{n} D_{0,k}\,e^{-A_{0,k}}\,e^{-\tilde B^1_{0,k}x - \tilde C^{22}_{0,k}y^2 - \tilde C^{33}_{0,k}z^2},\tag{86}$$

$$h(x, y) := \sum_{k=1}^{n} (R\gamma + 1)\,e^{-A_{0,k}}\,e^{-B^1_{0,k}x - C^{22}_{0,k}y^2},\tag{87}$$

with the coefficients given by (67) for $t = T_0$. Note that, by a slight abuse of notation, we write $D_{0,k}$ for $D_{T_0,k}$ and similarly for the other coefficients above, always meaning $t = T_0$ in (67). We distinguish the two cases specified in (85):

For Case (1) we have (see (67) and Lemma 5.2) that $\tilde C^{33}_{0,k} = \rho^3(T_0, T_k) < 0$ for all $k = 1, \dots, n$, and so the function $g(x, y, z)$ in (86) is, for given $(x, y)$, monotonically increasing for $z \ge 0$ and decreasing for $z < 0$ with

$$\lim\_{z \to \pm \infty} \mathbf{g}(x, y, z) = +\infty.$$

For Case (2) we have instead that $\tilde C^{33}_{0,k} = \rho^3(T_0, T_k) > 0$ for all $k = 1, \dots, n$, and so the nonnegative function $g(x, y, z)$ in (86) is, for given $(x, y)$, monotonically decreasing for $z \ge 0$ and increasing for $z < 0$ with

$$\lim\_{z \to \pm \infty} \mathbf{g}(x, y, z) = 0.$$

Analogously to Definition 5.1 we next introduce the following objects:

**Definition 5.2** Let the set $\bar M \subset \mathbb{R}^2$ be given by

$$\bar M := \{(x, y) \in \mathbb{R}^2 \mid g(x, y, 0) \le h(x, y)\}.\tag{88}$$

Since $g(x, y, z)$ and $h(x, y)$ are continuous, $\bar M$ is closed, measurable and connected. Let $\bar M^c$ be its complement. Furthermore, we define two functions $\bar z^1(x, y)$ and $\bar z^2(x, y)$, distinguishing between the two Cases (1) and (2) as specified in (85).

Case (1) If $(x, y) \in \bar M$, we have $g(x, y, 0) \le h(x, y)$ and so there exist $\bar z^1(x, y) \le 0$ and $\bar z^2(x, y) \ge 0$ for which, for $i = 1, 2$,

$$\begin{aligned}
g(x, y, \bar z^i) &= \sum_{k=1}^n D_{0,k}\,e^{-A_{0,k}}\,e^{-\tilde B^1_{0,k}x - \tilde C^{22}_{0,k}y^2 - \tilde C^{33}_{0,k}(\bar z^i)^2} \\
&= \sum_{k=1}^n (R\gamma + 1)\,e^{-A_{0,k}}\,e^{-B^1_{0,k}x - C^{22}_{0,k}y^2} = h(x, y)
\end{aligned}\tag{89}$$

and, for $z \notin [\bar z^1, \bar z^2]$, one has $g(x, y, z) \ge g(x, y, \bar z^i)$. If $(x, y) \in \bar M^c$, we have $g(x, y, 0) > h(x, y)$, so that $g(x, y, z) \ge g(x, y, 0) > h(x, y)$ for all $z$, and there are no points corresponding to $\bar z^1(x, y)$ and $\bar z^2(x, y)$ above.

Case (2) If $(x, y) \in \bar M$, we have, as for Case (1), $g(x, y, 0) \le h(x, y)$ and so there exist $\bar z^1(x, y) \le 0$ and $\bar z^2(x, y) \ge 0$ for which (89) holds for $i = 1, 2$. However, this time it is for $z \in [\bar z^1, \bar z^2]$ that one has $g(x, y, z) \ge g(x, y, \bar z^i)$. If $(x, y) \in \bar M^c$, then we are in the same situation as for Case (1).
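Unlike the caplet case, the equation $g(x, y, z) = h(x, y)$ built from (86)–(87) mixes several coefficients $\tilde C^{33}_{0,k}$, so $\bar z^1, \bar z^2$ must in general be found numerically. A bracketing sketch for Case (1); `swaption_zbars` and all coefficient arrays are illustrative placeholders, not calibrated output:

```python
import numpy as np
from scipy.optimize import brentq

def swaption_zbars(x, y, D, A, Bt, C22t, C33t, B, C22, R, gamma, zmax=10.0):
    """Roots zbar1 <= 0 <= zbar2 of g(x, y, z) = h(x, y) from (86)-(87).

    D, A, Bt, C22t, C33t, B, C22: arrays of the coefficients D_{0,k},
    A_{0,k}, tilde B^1_{0,k}, tilde C^22_{0,k}, tilde C^33_{0,k},
    B^1_{0,k}, C^22_{0,k}.  In Case (1) (all C33t < 0), g grows without
    bound in |z|, so [−zmax, 0] and [0, zmax] bracket the two roots for
    zmax large enough.  Returns None on bar M^c."""
    def g(z):
        return np.sum(D * np.exp(-A - Bt * x - C22t * y ** 2 - C33t * z ** 2))
    h = np.sum((R * gamma + 1.0) * np.exp(-A - B * x - C22 * y ** 2))
    if g(0.0) > h:                  # (x, y) in bar M^c: no root
        return None
    phi = lambda z: g(z) - h
    return brentq(phi, -zmax, 0.0), brentq(phi, 0.0, zmax)
```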

Starting from (83) combined with (66) and taking into account the set $\bar M$ according to Definition 5.2, we can obtain the following expression for the swaption price at $t = 0$. As for the caps, here too we consider the joint Gaussian distribution $f_{(\Psi^1_{T_0}, \Psi^2_{T_0}, \Psi^3_{T_0})}(x, y, z)$ of the factors under the $T_0$-forward measure $Q^{T_0}$, and we have


$$\begin{aligned}
P^{Swn}(0;T_0,T_n,R) &= p(0,T_0)\,E^{T_0}\Big\{\big(P^{Sw}(T_0;T_n,R)\big)^+\,\Big|\,\mathcal{F}_0\Big\} \\
&= p(0,T_0)\int_{\mathbb{R}^3}\Big(\sum_{k=1}^n D_{0,k}\,e^{-A_{0,k}}\exp\!\big(-\tilde B^1_{0,k}x-\tilde C^{22}_{0,k}y^2-\tilde C^{33}_{0,k}z^2\big) \\
&\hspace{6em}-\sum_{k=1}^n(R\gamma+1)\,e^{-A_{0,k}}\exp\!\big(-B^1_{0,k}x-C^{22}_{0,k}y^2\big)\Big)^{\!+} f_{(\Psi^1_{T_0},\Psi^2_{T_0},\Psi^3_{T_0})}(x,y,z)\,dx\,dy\,dz \\
&= p(0,T_0)\int_{\bar M\times\mathbb{R}}\Big(\sum_{k=1}^n D_{0,k}\,e^{-A_{0,k}}\exp\!\big(-\tilde B^1_{0,k}x-\tilde C^{22}_{0,k}y^2-\tilde C^{33}_{0,k}z^2\big) \\
&\hspace{6em}-\sum_{k=1}^n(R\gamma+1)\,e^{-A_{0,k}}\exp\!\big(-B^1_{0,k}x-C^{22}_{0,k}y^2\big)\Big)^{\!+} f_{(\Psi^1_{T_0},\Psi^2_{T_0},\Psi^3_{T_0})}(x,y,z)\,dx\,dy\,dz \\
&\quad+p(0,T_0)\int_{\bar M^c\times\mathbb{R}}\Big(\sum_{k=1}^n D_{0,k}\,e^{-A_{0,k}}\exp\!\big(-\tilde B^1_{0,k}x-\tilde C^{22}_{0,k}y^2-\tilde C^{33}_{0,k}z^2\big) \\
&\hspace{6em}-\sum_{k=1}^n(R\gamma+1)\,e^{-A_{0,k}}\exp\!\big(-B^1_{0,k}x-C^{22}_{0,k}y^2\big)\Big)^{\!+} f_{(\Psi^1_{T_0},\Psi^2_{T_0},\Psi^3_{T_0})}(x,y,z)\,dx\,dy\,dz \\
&=: P^1(0;T_0,T_n,R)+P^2(0;T_0,T_n,R).
\end{aligned}\tag{90}$$

We can now state and prove the main result of this subsection, namely a pricing formula for swaptions in the Gaussian exponentially quadratic model of this paper. We have

**Proposition 5.2** *Assume that the parameters of the model are such that, if $h^3_k$ belongs to Case (1) in (85) and $h^3_k > 0$, then $h^3_k > \frac{1}{4(\sigma^3)^2 e^{-2b^3 T_k}}$. The arbitrage-free price at $t = 0$ of the swaption with payment dates $T_1 < \dots < T_n$ such that $\gamma = \gamma_k := T_k - T_{k-1}$ $(k = 1, \dots, n)$, with a given fixed rate $R$ and a notional $N = 1$, can be computed as follows, where we distinguish between the Cases (1) and (2) specified in Definition 5.2.*

**Case (1)** *We have*

$$\begin{aligned}
P^{Swn}(0;T_0,T_n,R) &= p(0,T_0)\Big\{\sum_{k=1}^n e^{-A_{0,k}}\Big[\int_{\bar M}D_{0,k}\exp\!\big(-\tilde B^1_{0,k}x-\tilde C^{22}_{0,k}y^2\big) \\
&\qquad\cdot\Big(\frac{e^{\frac12\theta_k^2\bar\beta^3_{T_0}-\bar\alpha^3_{T_0}\theta_k}}{\sqrt{1+2\bar\beta^3_{T_0}\tilde C^{33}_{0,k}}}\,\Phi(d^1_k(x,y)) - e^{-\tilde C^{33}_{0,k}(\bar z^1)^2}\,\Phi(d^2_k(x,y)) \\
&\qquad\quad + \frac{e^{\frac12\theta_k^2\bar\beta^3_{T_0}-\bar\alpha^3_{T_0}\theta_k}}{\sqrt{1+2\bar\beta^3_{T_0}\tilde C^{33}_{0,k}}}\,\Phi(-d^3_k(x,y)) - e^{-\tilde C^{33}_{0,k}(\bar z^2)^2}\,\Phi(-d^4_k(x,y))\Big)f_2(y)f_1(x)\,dy\,dx \\
&\quad + \int_{\bar M^c}\Big(D_{0,k}\,e^{-\tilde B^1_{0,k}x-\tilde C^{22}_{0,k}y^2}\,\frac{e^{\frac12\theta_k^2\bar\beta^3_{T_0}-\bar\alpha^3_{T_0}\theta_k}}{\sqrt{1+2\bar\beta^3_{T_0}\tilde C^{33}_{0,k}}} \\
&\qquad\quad - (R\gamma+1)\,e^{-B^1_{0,k}x-C^{22}_{0,k}y^2}\Big)f_2(y)f_1(x)\,dy\,dx\Big]\Big\}.
\end{aligned}\tag{91}$$

**Case (2)** *We have*

$$\begin{aligned}
P^{Swn}(0;T_0,T_n,R) &= p(0,T_0)\sum_{k=1}^n e^{-A_{0,k}}\Big[\int_{\bar M} D_{0,k}\exp\!\big(-\tilde B^1_{0,k}x-\tilde C^{22}_{0,k}y^2\big) \\
&\qquad\cdot\Big(\frac{e^{\frac12\theta_k^2\bar\beta^3_{T_0}-\bar\alpha^3_{T_0}\theta_k}}{\sqrt{1+2\bar\beta^3_{T_0}\tilde C^{33}_{0,k}}}\big(\Phi(d^3_k(x,y))-\Phi(d^1_k(x,y))\big) \\
&\qquad\quad - e^{-\tilde C^{33}_{0,k}(\bar z^1)^2}\big(\Phi(d^4_k(x,y))-\Phi(d^2_k(x,y))\big)\Big)f_2(y)f_1(x)\,dy\,dx \\
&\quad + \int_{\bar M^c}\Big(D_{0,k}\,e^{-\tilde B^1_{0,k}x-\tilde C^{22}_{0,k}y^2}\,\frac{e^{\frac12\theta_k^2\bar\beta^3_{T_0}-\bar\alpha^3_{T_0}\theta_k}}{\sqrt{1+2\bar\beta^3_{T_0}\tilde C^{33}_{0,k}}} \\
&\qquad\quad - (R\gamma+1)\,e^{-B^1_{0,k}x-C^{22}_{0,k}y^2}\Big)f_2(y)f_1(x)\,dy\,dx\Big].
\end{aligned}\tag{92}$$

*The coefficients in these formulas are as specified in (67) for $t = T_0$, $f_1(x)$, $f_2(y)$ are the Gaussian densities corresponding to (68) for $T = T_0$, and the functions $d^i_k(x, y)$, for $i = 1, \dots, 4$ and $k = 1, \dots, n$, are given by*

$$\begin{cases}
d^1_k(x,y) := \dfrac{\sqrt{1+2\bar\beta^3_{T_0}\tilde C^{33}_{0,k}}\,\bar z^1(x,y) - (\bar\alpha^3_{T_0} - \theta_k\bar\beta^3_{T_0})}{\sqrt{\bar\beta^3_{T_0}}} \\[2mm]
d^2_k(x,y) := \dfrac{\bar z^1(x,y) - \bar\alpha^3_{T_0}}{\sqrt{\bar\beta^3_{T_0}}} \\[2mm]
d^3_k(x,y) := \dfrac{\sqrt{1+2\bar\beta^3_{T_0}\tilde C^{33}_{0,k}}\,\bar z^2(x,y) - (\bar\alpha^3_{T_0} - \theta_k\bar\beta^3_{T_0})}{\sqrt{\bar\beta^3_{T_0}}} \\[2mm]
d^4_k(x,y) := \dfrac{\bar z^2(x,y) - \bar\alpha^3_{T_0}}{\sqrt{\bar\beta^3_{T_0}}}
\end{cases}\tag{93}$$

*with* $\theta_k := \bar\alpha^3_{T_0}\,\frac{1 - 1/\sqrt{1 + 2\bar\beta^3_{T_0}\tilde C^{33}_{0,k}}}{\bar\beta^3_{T_0}}$ *for* $k = 1, \dots, n$*, and where* $\bar z^1 = \bar z^1(x, y)$, $\bar z^2 = \bar z^2(x, y)$ *are the solutions in $z$ of the equation* $g(x, y, z) = h(x, y)$.

*In addition, the mean and variance values for the Gaussian factors* $(\Psi^1_{T_0}, \Psi^2_{T_0}, \Psi^3_{T_0})$ *are here given by*


$$\begin{cases} \begin{aligned} \bar{\alpha}\_{T\_0}^1 &= e^{-b^1 T\_0} \Psi\_0^1 - \frac{(\sigma^1)^2}{2(b^1)^2} e^{-b^1 T\_0} (1 - e^{2b^1 T\_0}) - \frac{(\sigma^1)^2}{(b^1)^2} (1 - e^{b^1 T\_0}) \\ \bar{\beta}\_{T\_0}^1 &= e^{-2b^1 T\_0} (e^{2b^1 T\_0} - 1) \frac{(\sigma^1)^2}{2(b^1)} \\ \bar{\alpha}\_{T\_0}^2 &= e^{-b^2 T\_0} \Psi\_0^2 \\ \bar{\beta}\_{T\_0}^2 &= e^{-2b^2 T\_0} \int\_0^{T\_0} e^{2b^2 u + 4(\sigma^2)^2 \tilde{C}^{22} (u, T\_0)} (\sigma^2)^2 du \\ \bar{\alpha}\_{T\_0}^3 &= e^{-b^3 T\_0} \Psi\_0^3 \\ \bar{\beta}\_{T\_0}^3 &= e^{-2b^3 T\_0} \frac{(\sigma^3)^2}{2b^3} (e^{2b^3 T\_0} - 1) . \end{aligned} \tag{94}$$

*Remark 5.2* A remark analogous to Remark 5.1 applies here too concerning the sets $\bar M$ and $\bar M^c$.

*Proof* First of all notice that, when $h^3_k < 0$ or $h^3_k > \frac{1}{4(\sigma^3)^2 e^{-2b^3 T_k}}$ in Case (1), this implies $1 + 2\bar\beta^3_{T_0}\tilde C^{33}_{0,k} \ge 0$ (in Case (2) we always have $1 + 2\bar\beta^3_{T_0}\tilde C^{33}_{0,k} \ge 0$). Hence the square root of the latter expression appearing in the various formulas of the statement of the proposition is well defined. This can be checked, similarly as in the proof of Lemma 5.1, by direct computation, taking into account the definitions of $\bar\beta^3_{T_0}$ in (94) and of $\tilde C^{33}_{0,k}$ in (67) and (65) for $t = T_0$.

We come now to the statement for:

**Case (1).** We distinguish between whether $(x, y) \in \bar M$ or $(x, y) \in \bar M^c$ and compute separately the two terms in the last equality in (90).

**(i)** For $(x, y) \in \bar M$ we have from Definition 5.2 that there exist $\bar z^1(x, y) \le 0$ and $\bar z^2(x, y) \ge 0$ so that, for $z \notin [\bar z^1, \bar z^2]$, one has $g(x, y, z) \ge g(x, y, \bar z^i)$. Taking into account that, under $Q^{T_0}$, the random variables $\Psi^1_{T_0}, \Psi^2_{T_0}, \Psi^3_{T_0}$ are independent, so that we may write $f_{(\Psi^1_{T_0},\Psi^2_{T_0},\Psi^3_{T_0})}(x, y, z) = f_1(x)f_2(y)f_3(z)$ (see also (68) and the line following it), we obtain

$$\begin{aligned}
P^1(0;T_0,T_n,R) &= p(0,T_0)\Big[\sum_{k=1}^n D_{0,k}\,e^{-A_{0,k}}\int_{\bar M}\exp\!\big(-\tilde B^1_{0,k}x-\tilde C^{22}_{0,k}y^2\big) \\
&\quad\cdot\Big(\int_{-\infty}^{\bar z^1(x,y)}\exp\!\big(-\tilde C^{33}_{0,k}z^2\big)f_3(z)\,dz - \int_{-\infty}^{\bar z^1(x,y)}\exp\!\big(-\tilde C^{33}_{0,k}(\bar z^1)^2\big)f_3(z)\,dz \\
&\qquad + \int_{\bar z^2(x,y)}^{+\infty}\exp\!\big(-\tilde C^{33}_{0,k}z^2\big)f_3(z)\,dz - \int_{\bar z^2(x,y)}^{+\infty}\exp\!\big(-\tilde C^{33}_{0,k}(\bar z^2)^2\big)f_3(z)\,dz\Big)f_2(y)f_1(x)\,dy\,dx\Big].
\end{aligned}\tag{95}$$

By means of calculations that are completely analogous to those in the proof of Proposition 5.1, we obtain, corresponding to (79)–(81) respectively and with the same meaning of the symbols, the following explicit expressions for the four inner integrals in (95), namely

$$\int_{-\infty}^{\bar z^1(x,y)} e^{-\tilde C^{33}_{0,k}z^2} f_3(z)\,dz = \frac{e^{\frac12\theta_k^2\bar\beta^3_{T_0} - \bar\alpha^3_{T_0}\theta_k}}{\sqrt{1 + 2\bar\beta^3_{T_0}\tilde C^{33}_{0,k}}}\,\Phi(d^1_k(x,y)),\tag{96}$$

$$\int_{-\infty}^{\bar z^1(x,y)} e^{-\tilde C^{33}_{0,k}(\bar z^1)^2} f_3(z)\,dz = e^{-\tilde C^{33}_{0,k}(\bar z^1)^2}\,\Phi(d^2_k(x,y)),\tag{97}$$

and, similarly,

$$\begin{aligned}
\int_{\bar z^2(x,y)}^{+\infty} e^{-\tilde C^{33}_{0,k}z^2} f_3(z)\,dz &= \frac{e^{\frac12\theta_k^2\bar\beta^3_{T_0} - \bar\alpha^3_{T_0}\theta_k}}{\sqrt{1 + 2\bar\beta^3_{T_0}\tilde C^{33}_{0,k}}}\,\Phi(-d^3_k(x,y)), \\
\int_{\bar z^2(x,y)}^{+\infty} e^{-\tilde C^{33}_{0,k}(\bar z^2)^2} f_3(z)\,dz &= e^{-\tilde C^{33}_{0,k}(\bar z^2)^2}\,\Phi(-d^4_k(x,y)),
\end{aligned}\tag{98}$$

where the *d<sup>i</sup> <sup>k</sup>* (*x*, *y*), for *i* = 1,..., 4 and *k* = 1,..., *n*, are as specified in (93). **(ii)** If (*x*, *y*) ∈ *M*¯ *<sup>c</sup>* then, according to Definition 5.2 we have *g*(*x*, *y*,*z*) ≥ *g*(*x*, *y*, 0) > *h*(*x*, *y*) for all *z*. Noticing that, analogously to (96),

$$\int\_{\mathbb{R}} e^{-\tilde{C}^{33}\_{0,k}z^{2}} f\_{3}(z)dz = \frac{e^{(\frac{1}{2}(\theta\_{k})^{2}\bar{\beta}^{3}\_{T\_{0}} - \bar{\alpha}^{3}\_{T\_{0}}\theta\_{k})}}{\sqrt{1 + 2\bar{\beta}^{3}\_{T\_{0}}\tilde{C}^{33}\_{0,k}}},$$

we obtain the following expression

$$\begin{split} P^{2}(0;T\_{0},T\_{n},R) &= p(0,T\_{0})\sum\_{k=1}^{n}e^{-A\_{0,k}}\int\_{\bar{M}^{c}\times\mathbb{R}}\Big[D\_{0,k}\,e^{-\tilde{B}^{1}\_{0,k}x-\tilde{C}^{22}\_{0,k}y^{2}-\tilde{C}^{33}\_{0,k}z^{2}}\\ &\qquad\qquad -(R\gamma+1)\,e^{-B^{1}\_{0,k}x-C^{22}\_{0,k}y^{2}}\Big]f\_{3}(z)f\_{2}(y)f\_{1}(x)\,dz\,dy\,dx\\ &= p(0,T\_{0})\sum\_{k=1}^{n}e^{-A\_{0,k}}\bigg[D\_{0,k}\int\_{\bar{M}^{c}}e^{-\tilde{B}^{1}\_{0,k}x-\tilde{C}^{22}\_{0,k}y^{2}}f\_{2}(y)f\_{1}(x)\,dy\,dx\\ &\qquad\qquad\times\frac{e^{(\frac{1}{2}(\theta\_{k})^{2}\bar{\beta}^{3}\_{T\_{0}}-\bar{\alpha}^{3}\_{T\_{0}}\theta\_{k})}}{\sqrt{1+2\bar{\beta}^{3}\_{T\_{0}}\tilde{C}^{33}\_{0,k}}}\\ &\qquad\qquad -(R\gamma+1)\int\_{\bar{M}^{c}}e^{-B^{1}\_{0,k}x-C^{22}\_{0,k}y^{2}}f\_{2}(y)f\_{1}(x)\,dy\,dx\bigg].\end{split}\tag{99}$$

Adding the two expressions in (i) and (ii) we obtain the statement for Case (1).

**Case (2).** Also for this case we distinguish between whether $(x,y) \in \bar{M}$ or $(x,y) \in \bar{M}^{c}$ and, again, compute separately the two terms in the last equality in (90).

**(i)** For $(x,y) \in \bar{M}$ there exist $\bar{z}^{1}(x,y) \le 0$ and $\bar{z}^{2}(x,y) \ge 0$ so that, contrary to Case (1), one has $g(x,y,z) \ge g(x,y,\bar{z}^{i})$ when $z \in [\bar{z}^{1}, \bar{z}^{2}]$. It follows that

$$\begin{split} P^{1}(0;T\_{0},T\_{n},R) &= p(0,T\_{0})\Big[\sum\_{k=1}^{n}D\_{0,k}e^{-A\_{0,k}}\int\_{\bar{M}}\exp(-\tilde{B}^{1}\_{0,k}x-\tilde{C}^{22}\_{0,k}y^{2})\\ &\quad\cdot\Big(\int\_{\bar{z}^{1}(x,y)}^{\bar{z}^{2}(x,y)}\exp(-\tilde{C}^{33}\_{0,k}z^{2})f\_{3}(z)dz\\ &\qquad-\int\_{\bar{z}^{1}(x,y)}^{\bar{z}^{2}(x,y)}\exp(-\tilde{C}^{33}\_{0,k}(\bar{z}^{1})^{2})f\_{3}(z)dz\Big)f\_{2}(y)f\_{1}(x)\,dy\,dx\Big]\\ &= p(0,T\_{0})\Big[\sum\_{k=1}^{n}D\_{0,k}e^{-A\_{0,k}}\int\_{\bar{M}}\exp(-\tilde{B}^{1}\_{0,k}x-\tilde{C}^{22}\_{0,k}y^{2})\\ &\quad\cdot\Big(\frac{e^{(\frac{1}{2}(\theta\_{k})^{2}\bar{\beta}^{3}\_{T\_{0}}-\bar{\alpha}^{3}\_{T\_{0}}\theta\_{k})}}{\sqrt{1+2\bar{\beta}^{3}\_{T\_{0}}\tilde{C}^{33}\_{0,k}}}\big(\Phi(d\_{k}^{3}(x,y))-\Phi(d\_{k}^{1}(x,y))\big)\\ &\qquad-e^{-\tilde{C}^{33}\_{0,k}(\bar{z}^{1})^{2}}\big(\Phi(d\_{k}^{4}(x,y))-\Phi(d\_{k}^{2}(x,y))\big)\Big)f\_{2}(y)f\_{1}(x)\,dy\,dx\Big],\end{split}\tag{100}$$

where we have made use of (96)–(98).

**(ii)** For $(x,y) \in \bar{M}^{c}$ we can conclude exactly as in Case (1) and, by adding the two expressions in (i) and (ii), we obtain the statement also for Case (2).

**Acknowledgements** The KPMG Center of Excellence in Risk Management is acknowledged for organizing the conference "Challenges in Derivatives Markets - Fixed Income Modeling, Valuation Adjustments, Risk Management, and Regulation".

**Open Access** This chapter is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, a link is provided to the Creative Commons license and any changes made are indicated.

The images or other third party material in this chapter are included in the work's Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work's Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.

# **References**


# **Multi-curve Construction**

# **Definition, Calibration, Implementation and Application of Rate Curves**

**Christian P. Fries**

**Abstract** In this chapter we discuss the definition, construction, interpolation and application of *curves*. We discuss discount curves, a tool for the valuation of deterministic cash flows, and forward curves, a tool for the valuation of linear cash flows of an index. A curve is mainly a tool to interpolate certain basic financial products (zero-coupon bonds, FRAs) with respect to maturity date and fixing date, such that it can be used to value products which can be represented as linear functions of possibly interpolated values of a discount or forward curve. For this, the chosen interpolation method and interpolation entity play an important role. Distinguishing forward curves from discount curves (representing the collateralization of the forward) motivates an alternative interpolation method, namely interpolation of the forward value (the product of the forward and the discount factor). In addition, treating forward curves as native curves (instead of representing them by pseudo-discount curves) avoids other problems, like that of overlapping instruments. Besides the interpolation, we discuss the calibration of the curves, for which we give a generic object-oriented implementation in Fries (Curve calibration. Object-oriented reference implementation, 2010–2015, [11]). We give some numerical results, which have been obtained using this implementation, and conclude with a remark on how to define term structure models (analogous to a LIBOR market model) based on the definition of the performance index of an accrual account associated with a discount curve.

**Keywords** Multi-curve construction · Interest rate curves · Interest rate curve interpolation · Cross-currency curves · Term structure models

C.P. Fries (B)

DZ BANK AG, Frankfurt, Germany

e-mail: email@christian-fries.de

C.P. Fries Department of Mathematics, LMU Munich, Munich, Germany

# **1 Introduction**

Dynamic multi-curve term structure models, such as those discussed in this book, often use given interest rate curves as initial data. The classical (single-curve) example is the HJM or LMM model, where

$$\mathrm{d}f(t,T) = \mu(t,T)\,\mathrm{d}t + \Sigma(t,T)\,\mathrm{d}W(t), \qquad f(t\_0,T) = f\_0(T).$$

While research on multi-curve interest rate models was and is very active (see, e.g., [5, 6, 15, 20–22], the references therein, and the other chapters of this book), the construction of the initial interest rate curve, here $f\_0(T)$, naturally does not receive similarly strong attention. However, a good curve construction is of high importance for practitioners, since it has a strong impact on the delta-hedge (that is, the first-order interest rate risk).

The market standard of (forward) curve construction is to calibrate an interpolated curve to given market instruments, often via an iterative procedure (bootstrapping). With respect to the interpolation of (interest rate) forward curves, a common approach is to represent a forward curve in terms of (pseudo-)discount factors (also known as synthetic discount factors) and to apply an interpolation scheme to these discount factors. While this approach is in general not backed by an economic concept, it also introduces several (self-made) problems, e.g., the interpolation of so-called overlapping instruments, see Sect. 5.3.

In this paper we focus on the curve construction, provide an open-source implementation, and suggest appealing alternative interpolation schemes motivated by the multi-curve setup: direct interpolation of the forward curve or direct interpolation of the forward value curve, where the forward value is the product of a forward and the associated discount factor. While linear interpolation of the forward is a common scheme,<sup>1</sup> the interpolation of the forward value appears to be a new approach.

Nevertheless, the paper puts both methods on a solid foundation by deriving the schemes from the multi-curve definition of forward curves. Both interpolation schemes ease some of the complications associated with synthetic discount factors.

Once the curves and interpolations are defined, we consider the problem of calibrating a set of curves to given market quotes. The value of an instrument is in general determined by a whole collection of curves, e.g., one or two discount curves and zero or more forward curves. To simplify the implementation, we define a generalized swap, which allows us to represent most calibration instruments (FRAs, swaps, tenor basis swaps, cross-currency swaps, etc.) by a single class.

<sup>1</sup>Some trading systems, like Murex, do offer it as an option.

# **2 Foundations, Assumptions, Notation**

Under well-known assumptions the valuation of a future cash flow can be written as an expectation,<sup>2</sup> that is, the time-$t\_0$ value $V(t\_0)$ is

$$V(t\_0) = N(t\_0) \cdot \mathrm{E}^{\mathbb{Q}^{N}} \left( \frac{V(T)}{N(T)} \mid \mathcal{F}\_{t\_0} \right) \quad \text{for } t\_0 \le T,\tag{1}$$

where *V*(*T*) is the time-*T* cash flow, *N* is the value process of a traded asset (or collateral account) which can serve as a *numéraire* and $\mathbb{Q}^{N}$ is the equivalent martingale measure associated with *N*. Equation (1) is the starting point for *curve construction* in the following sense: if the above valuation formula holds, then the value of a linear function of future cash flows is the same linear function of the values of the single cash flows. In other words, we can represent the valuation of so-called linear products by a *basis* consisting of the values of elementary products. This basis of elementary products is the set of curves, where "the curve" is formed by the parameter *T*.

Note that here and in the following, we consider the valuation for a fixed *t*0. We are not concerned with the description of a dynamic model (describing *t* → *V*(*t*) as a stochastic process).

**Definition 1** Let *I* denote an index, that is, $I(T)$ is an $\mathcal{F}\_{T}$-measurable random variable, and let $d \ge 0$ be some payment offset; then we define the (time-$t\_0$) valuation curve with respect to $T$ as the map

$$T \mapsto \mathcal{C}(T) := N(t\_0) \cdot \mathrm{E}^{\mathbb{Q}^{N}} \left( \frac{I(T)}{N(T+d)} \mid \mathcal{F}\_{t\_0} \right). \tag{2}$$

For $I \equiv 1$ and $d = 0$ the curve in (2) represents the curve of (synthetic) zero-coupon bond prices $T \mapsto P(T;t\_0)$, also known as the *discount curve*.<sup>3</sup> For arbitrary indices $I$ (with fixed payment offset $d$<sup>4</sup>), the curve $T \mapsto \mathcal{C}(T)/P(T+d;t\_0)$ is known as the *forward curve*. Obviously both curves depend on $N$ and $t\_0$.

Note that the specific stochastic behavior of $I$ and $N$ does not play a role when looking at $t\_0$ only, in the sense that we are only interested in the time-$t\_0$ expectation. That is, we could define $t \mapsto N(t)$ and $t \mapsto I(t)$ to be $\mathcal{F}\_{t\_0}$-measurable for all times $t$ and still generate any given discount curve and forward curve, respectively. There is no arbitrage constraint with respect to $t$ yet, since for different $t$ the index $I(t)$ represents different assets (underlyings).

<sup>2</sup>Since we are only considering the linearity of the valuation at a fixed time $t\_0$, we just require that some fundamental theorem of asset pricing holds, for example, assuming that the price processes are locally bounded semi-martingales and the *no free lunch with vanishing risk* condition holds, [7].

<sup>3</sup>We will use the notation $P(T;t)$ (instead of the more common $P(t,T)$) for the time-$t$ value of a zero-coupon bond maturing in $T$, since we consider $t = t\_0$ as fixed. Sometimes we even drop the argument and just write $P(T)$. Similarly for forward curves. The curves considered here are parametrized by $T$ for a fixed time $t$.

<sup>4</sup>In practice the payment offset may depend on $t\_0$ and $T$ due to business day adjustments. Our implementation handles this, but to ease notation we drop the dependence here.

Thus, with respect to the processes $1/N$ and $I/N$ we just require that they fulfill regularity assumptions such that (2) exists.<sup>5</sup>

On the other hand, the interpretation of the curve as a curve of valuations in the sense of (2) does play a role when we consider the construction of the curve via interpolation of observed market prices. Here, the linearity of the expectation operator E allows us to link market prices to different points of the curve.

Curves, like discount curves and forward curves, solve, among others, two important problems:

1. they allow us to value cash flows whose payment or fixing dates lie between (or beyond) the maturities of the quoted calibration instruments, and
2. they provide the initial data for dynamic term structure models.
Thus, curves are simply a methodology to interpolate on the cash flows with respect to their payment time.<sup>6</sup> Apart from this, the curves also represent the initial data for advanced term structure models (like the LIBOR market model). Hence, careful construction of curves is also key to (interest rate) derivatives valuation, when interpolated curves are the initial values of a dynamic model.

For details on the evolution of multi-curve construction see the recent book by Henrard [18] (citing a preprint of the present paper). A very detailed description of multi-curve bootstrapping, which also details market conventions and convexity adjustments of the calibration instruments, can be found in [2]. For market conventions also see [17]. Here, we do not consider a possible convexity adjustment due to different market conventions (they should be part of the valuation formulas) and rather focus on the curves and their interpolation schemes. Also, we do not need to consider bootstrapping, since we set up the calibration as a system of equations passed to a multi-dimensional optimization algorithm.
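The calibration approach just described (a simultaneous system of equations handed to a multi-dimensional optimizer, rather than sequential bootstrapping) can be sketched as follows. The single-curve setting, the annual-payment swaps, and the use of SciPy's `least_squares` are illustrative assumptions for this sketch, not the reference implementation of [11]:

```python
import numpy as np
from scipy.optimize import least_squares

# Pillar maturities (in years) and par swap rates to be repriced (illustrative).
maturities = np.array([1.0, 2.0, 5.0, 10.0])
par_rates = np.array([0.010, 0.012, 0.015, 0.018])

def discount_factors(zero_rates):
    # The curve is parametrized by continuously compounded zero rates
    # at the pillar maturities.
    return np.exp(-zero_rates * maturities)

def swap_residuals(zero_rates):
    # One residual per calibration instrument: the value of the par swap
    # (fixed leg minus floating leg) on a single curve with annual payments.
    df = discount_factors(zero_rates)
    residuals = []
    for T, rate in zip(maturities, par_rates):
        pay_times = np.arange(1.0, T + 0.5)
        dfs = np.interp(pay_times, maturities, df)  # interpolate pillar values
        annuity = dfs.sum()           # year fractions are all 1.0
        float_leg = 1.0 - dfs[-1]     # telescoping floating leg
        residuals.append(rate * annuity - float_leg)
    return np.array(residuals)

# All pillar equations are solved simultaneously; no bootstrapping order is needed.
result = least_squares(swap_residuals, x0=np.full(len(maturities), 0.01))
```

Because all instruments enter one residual vector, instruments depending on several curves at once (e.g., tenor basis swaps linking a discount and a forward curve) fit into the same scheme by enlarging the parameter vector.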

Usually (and here), the curves are used to interpolate at the fixed time $t\_0$ only. If a curve interpolation should also be used for times $t > t\_0$ within a dynamic multi-curve model, then this may impose additional constraints on the admissible interpolation schemes. For example, (2) implies that linear interpolation of the time-$t\_0$ zero-coupon bond prices carries over to linear interpolation of the time-$t$ zero-coupon bond prices, which in turn implies a special interpolation of forward rates in a LIBOR market model, see Sect. 19.5 in [10]. In this case the linear interpolation of the discount curve and forward value curve would not introduce an arbitrage violation, given that the interpolation points are the same for all times. In practice term-structure models are often constructed with their own curve interpolations, such that the interpolation used for the initial data differs from the interpolation used for the simulated curves (while the model is still calibrated and arbitrage-free, given that the drift is specified accordingly). In the following we focus on the interpolation of the initial data, that is, the time-$t\_0$ curves, which is of greater importance for the deltas of interpolated products, where only the linear part matters.

<sup>5</sup>For example, let $1/N$ and $I/N$ be Itô stochastic processes with integrable drift and bounded quadratic variation.

<sup>6</sup>This also applies to forward curves, see below, although in these cases there is also an associated fixing time of an index and it may be more consistent to parametrize the curve w.r.t. the fixing of the index.

In the above valuation formula (1) it is assumed that $V$ and $N$ are expressed in the same currency. If the two are in different currencies, one of them has to be converted by an exchange rate, which we denote by $FX$. Let $V$ be in currency $U\_2$ and the numéraire $N$ in currency $U\_1$; then the valuation formula is given by

$$V(t\_0) = FX^{\frac{U\_2}{U\_1}}(t\_0) \cdot N(t\_0) \cdot \mathrm{E}^{\mathbb{Q}^{N}} \left( \frac{V(T)}{FX^{\frac{U\_2}{U\_1}}(T) \cdot N(T)} \mid \mathcal{F}\_{t\_0} \right),$$

where $FX^{\frac{U\_2}{U\_1}}(t)$ denotes the time-$t$ exchange rate converting one unit of currency $U\_1$ into units of currency $U\_2$. Furthermore, $FX^{\frac{U\_1}{U\_2}} = \big(FX^{\frac{U\_2}{U\_1}}\big)^{-1}$.

As discussed in [12], the valuation of a collateralized claim can be written as an expectation with respect to a specific numéraire, namely the collateral account $N = N^{\mathbb{C}}$.<sup>7</sup> We denote the currency of the collateral numéraire by [C]. Let $U$ denote the currency of the cash flow $V(T)$. Assume that the cash flow $V(T)$ is collateralized by units of $N^{\mathbb{C}}$. In this case Eq. (1) holds with the numéraire $N = N^{\mathbb{C}}$, $U\_2 = U$, $U\_1 = [C]$ (given that $V(t)$ is the collateral amount in the account $N^{\mathbb{C}}$).

*Remark 1* From the above we see that collateralization in a different currency can be interpreted twofold:

1. as a valuation of the cash flow converted into the collateral currency, using the foreign collateral account $N^{\mathbb{C}}$ as numéraire and the exchange rate as above, or
2. as a valuation of the cash flow in its own currency $U$, using as numéraire the collateral account expressed in units of $U$, which we denote by $N^{U,\mathbb{C}}$.
We will adopt the latter interpretation, which will also make the valuation look more consistent<sup>8</sup>

$$V(t\_0) = N^{U,\mathbb{C}}(t\_0) \cdot \mathrm{E}^{\mathbb{Q}^{U,\mathbb{C}}} \left( \frac{V(T)}{N^{U,\mathbb{C}}(T)} \mid \mathcal{F}\_{t\_0} \right). \tag{3}$$

Note that this interpretation then gives rise to a new discount curve: the discount curve associated with $N^{U,\mathbb{C}}$, being the discount curve of a foreign currency ($U$) cash flow collateralized by an account C.

<sup>7</sup>See also [8, 14].

<sup>8</sup>As has been noted in [12], the measures agree, i.e., $\mathbb{Q}^{U,N^{\mathbb{C}}} = \mathbb{Q}^{N^{\mathbb{C}}}$.

*Remark 2* For an uncollateralized product the role of the collateral account is taken by the funding account and the corresponding numéraire is the funding account. Since the valuation formulas are identical to the case of a "special" collateral account (agreeing with the funding account), we will consider an uncollateralized product as a product with a different collateralization.

In the following we use the notation *U* for the currency unit of a cash flow, i.e., we may consider *U* = 1 € or *U* = 1 \$. We will need this notation only when we consider cross-currency basis swaps. The symbols *V* and *N* (as well as *P* defined below) denote value processes including the corresponding currency unit, e.g., *V*(*t*<sub>0</sub>) = 0.25 €. The symbol *V* refers to the value of the product under consideration, while *N* denotes the numéraire, e.g., the OIS-accrued collateral account. The symbol *X* denotes a real number while *I* denotes a real-valued stochastic process; both can be considered as rates, i.e., unit-less indices, e.g., *X* = 2.5 %. For example, *X* will denote the fixed rate in a swap, *I* the floating rate index in a swap, and *U* the currency unit of the two legs, while *N* will be used to define the discount factor and the value of the swap. The value of the swap is then denoted by *V*.

# **3 Discount Curves**

Consider a fixed constant cash flow *X*, paid in currency *U* at time *T*, collateralized by an account C. Since *X* is a constant and the expectation operator is linear, we can express the time-*t*<sub>0</sub> value *V*(*t*<sub>0</sub>) of this cash flow as

$$V(t\_0) = X \cdot P^{U, \mathbb{C}}(T; t\_0), \tag{4}$$

where

$$P^{U,\mathbb{C}}(T; t\_0) := N^{U,\mathbb{C}}(t\_0) \cdot \mathrm{E}^{\mathbb{Q}^{U,\mathbb{C}}} \left( \frac{1 \cdot U}{N^{U,\mathbb{C}}(T)} \mid \mathcal{F}\_{t\_0} \right) \tag{5}$$

defines the value of a theoretical zero-coupon bond. Note that Eq. (4) can be used in two ways. First, for given market prices we may determine $P^{U,\mathbb{C}}(T;t\_0)$, that is, we calibrate the curve $T \mapsto P^{U,\mathbb{C}}(T;t\_0)$. Second, for given $P^{U,\mathbb{C}}(T;t\_0)$ we may value a constant cash flow.

This defines the discount curve:

**Definition 2** Let *P<sup>U</sup>*,<sup>C</sup>(*T*;*t*0) denote the time *t*<sup>0</sup> value expressed in currency unit *U* of a unit cash-flow of 1 unit of the currency *U* in *T*, collateralized by a collateral account C. In this case we call *T* → *P<sup>U</sup>*,<sup>C</sup>(*T*;*t*0) given by (5) the *discount curve* for cash flows in currency *U* collateralized by the account *C*.

*Remark 3* By assumption (of a frictionless no-arbitrage market, (1)) the value of a fixed constant future cash-flow *X* is a linear function of its amount. Hence, we have that the time *t*<sup>0</sup> value of a cash flow *X* in *T* and currency *U*, collateralized with an account *C* is

$$X \cdot P^{U,\mathbb{C}}(T; t\_0).$$

In other words, the discount curve allows us to valuate all fixed (deterministic) cash flows in a given currency, collateralized by a given account.

The discount factor $P^{U,\mathbb{C}}(T;t\_0)$ represents the price of an (idealized) zero-coupon bond. Although a zero-coupon bond is usually not a market-traded asset, we may represent market-traded coupon bonds as a linear combination of zero-coupon bonds, and vice versa. If C denotes some cash-collateral account, there is no such thing as a collateralized bond, but in that case $P^{U,\mathbb{C}}(T;t\_0)$ has the natural interpretation of representing the time-$t\_0$ value of a collateralized unit currency time-$T$ cash flow. In any case, $P^{U,\mathbb{C}}(T;t\_0)$ can be considered a linear function of a traded asset (within its collateralization scheme).
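As a small illustration of Remark 3, a discount curve reduces the valuation of arbitrary deterministic cash flows to a lookup plus interpolation. The pillar values and the log-linear interpolation below are illustrative assumptions (interpolation schemes are discussed in Sect. 5):

```python
import math

# Illustrative discount curve pillars (T, P(T; t0)); P(0; t0) = 1 is prepended.
pillars = [(0.0, 1.0), (1.0, 0.99), (2.0, 0.975), (5.0, 0.93)]

def discount_factor(T):
    """P(T; t0) by linear interpolation of the log-discount factors."""
    for (T0, P0), (T1, P1) in zip(pillars, pillars[1:]):
        if T0 <= T <= T1:
            w = (T - T0) / (T1 - T0)
            return math.exp((1 - w) * math.log(P0) + w * math.log(P1))
    raise ValueError("maturity outside curve range")

def value_fixed_cashflows(cashflows):
    """Eq. (4) extended by linearity: V(t0) = sum over i of X_i * P(T_i; t0)."""
    return sum(X * discount_factor(T) for X, T in cashflows)

# A two-payment bond-like schedule: 100 in 2y and 100 in 5y.
value = value_fixed_cashflows([(100.0, 2.0), (100.0, 5.0)])
```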

# **4 Forward Curves**

The same approach can now be applied to a payoff of a cash flow $X \cdot I(T\_1)$, paid in currency $U$ at time $T\_2$ ($T\_1 \le T\_2$), collateralized by account C, where $X$ is a constant and $I$ is an adapted process representing an index.<sup>9</sup> Its value is

$$V(t\_0) = N^{U,\mathbb{C}}(t\_0) \cdot \mathrm{E}^{\mathbb{Q}^{U,\mathbb{C}}} \left( \frac{X \cdot I(T\_1) \cdot U}{N^{U,\mathbb{C}}(T\_2)} \mid \mathcal{F}\_{t\_0} \right).$$

We can express the value as $V(t\_0) = X \cdot F\_I^{U,\mathbb{C}}(T\_1, T\_2; t\_0) \cdot P^{U,\mathbb{C}}(T\_2; t\_0)$, where

$$F\_I^{U,\mathbb{C}}(T\_1, T\_2; t\_0) = N^{U,\mathbb{C}}(t\_0) \cdot \mathrm{E}^{\mathbb{Q}^{U,\mathbb{C}}} \left( \frac{I(T\_1) \cdot U}{N^{U,\mathbb{C}}(T\_2)} \mid \mathcal{F}\_{t\_0} \right) \Big/ P^{U,\mathbb{C}}(T\_2; t\_0). \tag{6}$$

This definition allows us to derive $F\_I^{U,\mathbb{C}}(T\_1, T\_2; t\_0)$ from given market prices. Conversely, given $P^{U,\mathbb{C}}(T\_2; t\_0)$ and $F\_I^{U,\mathbb{C}}(T\_1, T\_2; t\_0)$ we may value all linear payoff functions of $I(T\_1)$ paid in $T\_2$.

In (6) the forward depends on the fixing time $T\_1$ and the payment time $T\_2$. However, the offset of the payment time from the fixing time, $d = T\_2 - T\_1$, can be viewed as a property of the index (a constant) and hence the forward represents a curve $T \mapsto F\_I^{U,\mathbb{C}}(T, T+d; t\_0)$.

**Definition 3** Let $t \mapsto I(t)$ denote an index, that is, $I$ is an adapted, real-valued stochastic process. Let

$$V\_I^{U,\mathbb{C}}(T, T+d; t\_0) := N^{U,\mathbb{C}}(t\_0) \cdot \mathrm{E}^{\mathbb{Q}^{U,\mathbb{C}}} \left( \frac{I(T) \cdot U}{N^{U,\mathbb{C}}(T+d)} \mid \mathcal{F}\_{t\_0} \right)$$

<sup>9</sup>Examples for *I* are LIBOR rates or the performance of an EONIA accrual account.

denote the time-$t\_0$ value of a payment of $I(T)$ paid in $T+d$ in currency $U$, collateralized by an account C (where $d \ge 0$). We assume that $I$ and $N$ are such that the expectation exists for all $T$. Then we define the *forward of a payment of* $I(T)$ *paid in* $T+d$ *in currency* $U$, *collateralized by an account* C as

$$F\_I^{U, \mathbb{C}}(T; t\_0) := \frac{V\_I^{U, \mathbb{C}}(T, T + d; t\_0)}{P^{U, \mathbb{C}}(T + d; t\_0)}.$$

*Remark 4* The forward curve allows us to value a future payment of the index *I* by

$$V\_I^{U,\mathbb{C}}(T, T+d; t\_0) = F\_I^{U,\mathbb{C}}(T; t\_0) \cdot P^{U,\mathbb{C}}(T+d; t\_0)$$

and by assumption (of a frictionless no-arbitrage market, (1)), the forward curve allows us to evaluate all linear cash flows $X \cdot I$ (in currency $U$, collateralized by an account C) by $X \cdot F\_I^{U,\mathbb{C}}(T; t\_0) \cdot P^{U,\mathbb{C}}(T+d; t\_0)$.

Note that $F\_I^{U,\mathbb{C}}$ is not a classical single-curve forward rate related to some discount curve. Due to our definition of the forward curve, the curve includes all valuation effects related to the index, in particular a possible convexity adjustment. For example, if we considered an in-arrears index and an in-advance index, we would obtain two different forward curves, which differ by the in-arrears convexity adjustment.
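A minimal sketch of Definition 3 and Remark 4: the forward curve is stored at pillar fixing times, and every linear payoff of the index is valued as forward times discount factor. The pillar values, the payment offset, and the flat discount curve in the usage example are illustrative assumptions:

```python
import math

# Illustrative forward curve pillars (T, F_I(T; t0)) and payment offset d.
forward_pillars = [(0.5, 0.020), (1.0, 0.022), (2.0, 0.025)]
d = 0.5

def forward(T):
    """F_I(T; t0) by linear interpolation in the fixing time T."""
    for (T0, F0), (T1, F1) in zip(forward_pillars, forward_pillars[1:]):
        if T0 <= T <= T1:
            w = (T - T0) / (T1 - T0)
            return (1 - w) * F0 + w * F1
    raise ValueError("fixing time outside curve range")

def value_linear_payoff(X, T, P):
    """Value of X * I(T) paid in T + d: X * F_I(T; t0) * P(T + d; t0)."""
    return X * forward(T) * P(T + d)

# Usage with a flat illustrative discount curve P(T; t0) = exp(-2% * T).
v = value_linear_payoff(1.0, 1.0, lambda T: math.exp(-0.02 * T))
```

Note that any convexity adjustment of the index is already contained in the stored forwards, in line with the remark above.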

# *4.1 Performance Index of a Discount Curve (or "Self-Discounting")*

The OIS swap pays the performance of an account, accruing with the overnight rate, that is:

**Definition 4** (*Overnight Index Swap*) Let $N^{\mathbb{C}}(t)$ denote the account accruing at the overnight rate $r(t)$, with $N^{\mathbb{C}}(t\_0) = 1\,U$, i.e., on a given time discretization (accrual periods) $\{t\_i\}\_{i=0}^{n}$

$$N^{\mathbb{C}}(t\_k) := \prod\_{i=0}^k (1 + r(t\_i)\Delta t\_i) \approx \exp\left(\int\_{t\_0}^{t\_k} r(s)ds\right).$$

The overnight index swap pays a fixed coupon and receives the performance $I\_i^{\mathbb{C}}$ of the accrual account, that is,

$$I\_i^{\mathbb{C}}(T\_i, T\_{i+1}) := \frac{N^{\mathbb{C}}(T\_{i+1})}{N^{\mathbb{C}}(T\_i)} - 1$$

in $T\_{i+1}$ with a quarterly tenor $T\_0, T\_1, \ldots$.
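Numerically, the discrete accrual account of Definition 4 and its performance index can be sketched as follows; the flat 2 % overnight rate and the daily grid are illustrative assumptions:

```python
import math

def accrual_account_values(rates, dts):
    """Running values N^C(t_k) = prod_{i<k} (1 + r(t_i) * dt_i), N^C(t_0) = 1."""
    values = [1.0]
    for r, dt in zip(rates, dts):
        values.append(values[-1] * (1.0 + r * dt))
    return values

# Daily accrual at a flat 2% overnight rate over one year.
n = 365
N = accrual_account_values([0.02] * n, [1.0 / n] * n)

# Performance index over the full year, I = N^C(T_1) / N^C(T_0) - 1,
# close to the continuously compounded performance exp(0.02) - 1.
I = N[-1] / N[0] - 1.0
```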

The time-$t\_0$ linear forward of the index above is $\frac{P^{U,\mathbb{C}}(T\_i; t\_0) - P^{U,\mathbb{C}}(T\_{i+1}; t\_0)}{P^{U,\mathbb{C}}(T\_{i+1}; t\_0)}$ (and dividing by $T\_{i+1} - T\_i$ gives the linear forward rate). Hence, this is the same situation as for swaps in single-curve interest rate theory.

The OIS swap is collateralized with respect to the account *N*<sup>C</sup>. Due to this, it is sometimes called "self-discounted". However, we may give an appealing alternative view, defining the forward curve from the discount curve (and not the other way around):

Let us consider a discount factor curve $P^{U,\mathbb{C}}(T; t)$ as seen at time $t$. The curve allows the definition of a special index, namely the performance rate of the collateral account C in currency $U$ over a period of length $d$:

Let $I^{\mathbb{C}}(T\_i) := \frac{1 - P^{U,\mathbb{C}}(T\_i+d; T\_i)}{P^{U,\mathbb{C}}(T\_i+d; T\_i)}$, where $P^{U,\mathbb{C}}(T\_i+d; T\_i)$ is the discount factor for the maturity $T\_i+d$ as seen at time $T\_i$. The index $I^{\mathbb{C}}(T\_i)$ is the payment we have to receive in $T\_i+d$, collateralized with respect to the collateral account C, such that $1 + I^{\mathbb{C}}(T\_i)$ in $T\_i+d$ has the same value as 1 in $T\_i$. This index has a special property, namely that its forward can be expressed in terms of the discount factor curve $P^{U,\mathbb{C}}$ too: the time-$t\_0$ forward of $I^{\mathbb{C}}(T\_i)$ is $F^{U,\mathbb{C}}(T\_i; t\_0)$, where

$$\begin{split} F^{U,\mathbb{C}}(T\_i; t\_0) \cdot P^{U,\mathbb{C}}(T\_i+d; t\_0) &= N^{U,\mathbb{C}}(t\_0) \cdot \mathrm{E}^{\mathbb{Q}^{U,\mathbb{C}}} \left( \frac{I^{\mathbb{C}}(T\_i) \cdot U}{N^{U,\mathbb{C}}(T\_i+d)} \mid \mathcal{F}\_{t\_0} \right) \\ &= P^{U,\mathbb{C}}(T\_i; t\_0) - P^{U,\mathbb{C}}(T\_i+d; t\_0). \end{split}$$

Consequently this index has the special property that its forward can be expressed by the associated discount factors evaluated at different maturities.

**Definition 5** (*Forward associated with a Discount Curve*) Let $T \mapsto P^{U,\mathbb{C}}(T; t\_0)$ denote a discount curve. For a given period length $d$ we define the forward $F^{d,U,\mathbb{C}}(T\_i; t\_0)$ as

$$F^{d,U,\mathbb{C}}(T\_i; t\_0) := \frac{P^{U,\mathbb{C}}(T\_i; t\_0) - P^{U,\mathbb{C}}(T\_i + d; t\_0)}{P^{U,\mathbb{C}}(T\_i + d; t\_0) \cdot d}. \tag{7}$$

$F^{d,U,\mathbb{C}}(T\_i; t\_0)$ is the forward associated with the performance index of $P^{U,\mathbb{C}}$ over a period of length $d$.
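Equation (7) translates directly into code. The flat discount curve in the usage example is an illustrative assumption; any callable mapping $T$ to $P^{U,\mathbb{C}}(T;t\_0)$ would do:

```python
import math

def forward_from_discount_curve(P, T, d):
    """Eq. (7): the forward of the performance index of the discount curve P
    over [T, T + d], i.e. (P(T) - P(T + d)) / (P(T + d) * d)."""
    return (P(T) - P(T + d)) / (P(T + d) * d)

# For a flat curve P(T) = exp(-r * T) the forward is (exp(r * d) - 1) / d,
# independent of T: the discretely compounded rate over the period.
P = lambda T: math.exp(-0.03 * T)
f = forward_from_discount_curve(P, 1.0, 0.5)
```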

*Remark 5* The above definition relates a forward curve and a discount factor curve. Note, however, that we define a forward from a discount factor curve and that this definition is backed by a clear interpretation of the underlying index. Conversely, we may define a discount curve from a forward curve "implicitly", such that the relation (7) holds. Note, however, that a generalization of this relation should be considered with care, since the associated product may not exist.

The definition above is an idealization in the sense that we assume that the interval points over which the performance is measured correspond to the payment dates. In practice (EONIA is an example) there might be some small deviations from this assumption (e.g., payment offsets of a few days). In this case (7) does not hold exactly (but may still be considered an approximation).

Products like the OIS swaps are sometimes called "self-discounting" since the discounting is performed on a curve corresponding to the index they fix. From the above, we find an alternative (and maybe more natural) interpretation, namely that the swap pays the performance index of its collateral account, i.e., it pays the index associated with the discount curve.

# **5 Interpolation of Curves**

In this section we consider a discount curve $P^{U,\mathbb{C}}$ and an associated forward curve $F^{U,\mathbb{C}}$. To simplify notation we set $D(T) := P^{U,\mathbb{C}}(T; t\_0)$ and $F(T) := F^{U,\mathbb{C}}(T; t\_0)$.

Forwards and discount factors are linked together by Definition 3, which says that the time-*t*<sup>0</sup> value of a forward contract *V*(*t*0, *T*) with fixing in *T* is the product of the forward *F*(*T*) and the associated discount factor *D*(*T* + *d*), i.e., *V*(*t*0, *T*) = *F*(*T*) · *D*(*T* + *d*). Note that *T* → *V*(*t*0, *T*) and *T* → *D*(*T*) are *value curves*, i.e., for a fixed *T* the quantities *V*(*t*0, *T*) and *D*(*T*) are values of financial products. However, *F*(*T*) is a derived quantity, the forward.

Since $V$ and $D$ represent values of financial products, there is a natural interpretation for a linear interpolation of different values $V(T\_i)$ and of different values $D(T\_i)$, since this corresponds to a portfolio of such products. Note that defining an interpolation method for $V$ and $D$ implies a (possibly more complex) interpolation method for $F$.
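The forward value interpolation suggested here can be sketched as follows: interpolate the value curve $V(T) = F(T) \cdot D(T+d)$ linearly (which corresponds to a portfolio of forward contracts) and recover the interpolated forward by dividing by the discount factor. The two pillars and the flat discount curve are illustrative assumptions:

```python
import math

d = 0.5  # payment offset of the index (assumption)

def D(T):
    return math.exp(-0.02 * T)  # illustrative discount curve

pillars = [(1.0, 0.020), (2.0, 0.025)]  # (fixing time T, forward F(T))

def forward_value_interpolated(T):
    """F(T) implied by linear interpolation of the forward values V(T) = F * D."""
    (T0, F0), (T1, F1) = pillars
    V0, V1 = F0 * D(T0 + d), F1 * D(T1 + d)
    w = (T - T0) / (T1 - T0)
    V = (1 - w) * V0 + w * V1   # linear in the value curve V
    return V / D(T + d)         # back out the forward
```

By construction the scheme reproduces the pillar forwards exactly and interpolates values, not synthetic discount factors.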

On the other hand, it is common practice to define an interpolation method for a rate curve (both forward curve and discount factor curve) via zero rates, sometimes even regardless of the nature of the curve, which then implies the interpolation of the value curves $D$ and $V$. Some of these interpolations result in natural interpolations of the value process $V$, others do not. Other examples of interpolations of $F$ and $D$ are:


In [16], interpolations on the discount factors, on the logarithm of the discount factors, on the yield, and directly on the forwards were discussed. Highlighting some disadvantages of cubic splines, the authors introduced two new interpolation methods (monotone convex spline and minimal cubic spline) which overcome most of the shortcomings of the other interpolations. In [19] some issues of these methods were pointed out, favoring a harmonic spline interpolation. In [1] a modified Bessel spline on the logarithm of the discount factors was proposed.

Based on the formal setup presented in the present paper, the stability of the cumulated error of a dynamic hedge was considered in [13] as a criterion for interpolation methods and compared across a large collection of methods.

At this point, we would like to stress the importance of the *interpolation entity*, that is, whether we interpolate on a forward or on a synthetic discount factor (in the sense of Definition 5). While the interpolation method (e.g., *linear* compared to *spline*) is often the focus of the discussion (locality versus smoothness, cf. [16]), the choice of the interpolation entity has a strong impact on the delta hedge, see Table 1.

Depending on the application, it is popular to represent a curve by a parametric curve. This is done especially for discount curves. Examples are the Nelson–Siegel (NS) and the Nelson–Siegel–Svensson (NSS) parametrizations. Our benchmark implementation in [11] allows using NS or NSS in the calibration.<sup>10</sup>

# *5.1 Implementing the Interpolation of a Curve: Interpolation Method and Interpolation Entities*

In this paper we focus on interpolation schemes based on given interpolation points. When implementing the interpolation of a curve this way, it is convenient to distinguish the *interpolation method*, e.g., linear interpolation of interpolation points {(*T*<sub>*i*</sub>, *x*<sub>*i*</sub>)}, and the *interpolation entity*, that is, a (bijective) transformation from (*T*, *x*) to the actual curve. For example, for discount curves one might consider a linear interpolation of the zero rate. In this case the interpolation method is *linear interpolation* and the interpolation entity is (*T*, *x*(*T*)) = (*T*, log(*D*(*T*))/*T*) for *T* > 0, where *D* denotes the discount curve. Given 0 < *T*<sub>*i*</sub> ≤ *T* ≤ *T*<sub>*i*+1</sub> and discount factors *D*(*T*<sub>*j*</sub>), a linear interpolation of the zero rates would then imply the interpolation

$$D(T) := \exp\left( \left( \frac{T - T\_i}{T\_{i+1} - T\_i} \frac{\log(D(T\_{i+1}))}{T\_{i+1}} + \frac{T\_{i+1} - T}{T\_{i+1} - T\_i} \frac{\log(D(T\_i))}{T\_i} \right) \cdot T \right).$$
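The interpolation above can be computed directly. The following is a minimal self-contained sketch (not the finmath library implementation; class and method names are ours): the zero-rate-like entity *x*(*T*) = log(*D*(*T*))/*T* is interpolated linearly and transformed back to a discount factor.

```java
// Sketch: linear interpolation of the interpolation entity x(T) = log(D(T))/T
// (log-value-per-time), implying the discount factor formula above.
public class ZeroRateInterpolation {

	/**
	 * Interpolates the discount factor D(T) for t1 <= T <= t2,
	 * given D(t1) = d1 and D(t2) = d2, by linear interpolation of log(D(T))/T.
	 */
	public static double discountFactor(double T, double t1, double d1, double t2, double d2) {
		double x1 = Math.log(d1) / t1;		// interpolation entity at t1
		double x2 = Math.log(d2) / t2;		// interpolation entity at t2
		double x = ((T - t1) * x2 + (t2 - T) * x1) / (t2 - t1);	// linear in x
		return Math.exp(x * T);				// back-transform to a discount factor
	}

	public static void main(String[] args) {
		// Interpolate between D(1.0) = exp(-0.01) and D(2.0) = exp(-0.04).
		double d = discountFactor(1.5, 1.0, Math.exp(-0.01), 2.0, Math.exp(-0.04));
		System.out.println(d); // equals exp(-0.015 * 1.5), since the zero rates are 1% and 2%
	}
}
```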

In our benchmark implementation [11], this functionality is provided for a large number of interpolation methods (constant, linear, Akima, spline, etc.) and interpolation entities (value, log-value, log-value-per-time) by the class net.finmath.marketdata.model.curves.Curve.<sup>11</sup> For forward curves we provide two additional interpolation entities: forward and synthetic discount factor (see below).

# *5.2 Interpolation Time*

For both parametric curves (like NSS) and non-parametric interpolation schemes, it is important to specify the convention used to transform product maturities (dates) to real numbers (time *T*). For example, we might use a daycount convention (like

<sup>10</sup>See http://finmath.net/finmath-lib/apidocs/net/finmath/marketdata/model/curves/DiscountCurve NelsonSiegelSvensson.html.

<sup>11</sup>See http://finmath.net/finmath-lib/apidocs/net/finmath/marketdata/model/curves/Curve.html.

ACT/365) and measure *T* as a daycount fraction between evaluation date and maturity date, that is, *T* := dcf(evaluation date, maturity date). Clearly, a change in the time parametrization will change the interpretation of the curve parameters (of a parametric curve). Also, some daycount conventions actually introduce non-linear time transformations.
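Such a time transformation can be sketched in a few lines using Java's standard `java.time` classes (a simplified illustration; the finmath library provides its own daycount convention classes):

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Sketch: mapping dates to interpolation time T via the ACT/365 convention.
public class InterpolationTime {

	/** Daycount fraction under ACT/365: actual days between the dates divided by 365. */
	public static double actThreeSixtyFive(LocalDate start, LocalDate end) {
		return ChronoUnit.DAYS.between(start, end) / 365.0;
	}

	public static void main(String[] args) {
		LocalDate evaluationDate = LocalDate.of(2016, 1, 1);
		LocalDate maturityDate = LocalDate.of(2017, 1, 1);	// 2016 is a leap year
		// One calendar year maps to 366/365, a (mild) non-linearity of the time transformation.
		System.out.println(actThreeSixtyFive(evaluationDate, maturityDate)); // 366/365
	}
}
```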

# *5.3 Interpolation of Forward Curves*

#### **5.3.1 The Classical Approach**

For forward curves, a common approach is to consider an interpolation of the forward as an independent entity (as for the discount curve). For interest rate forwards, a popular interpolation scheme (coming from the single curve interpretation of interest rate forwards) is to represent the forward in terms of synthetic discount factors. That is, if *d* denotes a period length associated with the forward and if *F*(*T*<sub>*i*</sub>) is given for *T*<sub>*i*</sub> = *i* · *d*, then one might consider interpolation of the (pseudo-)discount factors $D^{F}(T\_i) := \prod\_{k=0}^{i-1} (1 + F(T\_k) \cdot d)^{-1}$, possibly considering another transformation of *D*<sup>*F*</sup>(*T*) to define the actual interpolation entity. See [3] for a corresponding multi-curve bootstrap algorithm.
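The construction of the synthetic discount factors can be sketched as follows (a self-contained illustration on an equidistant grid; the class and method names are ours, not the library's):

```java
// Sketch: building synthetic (pseudo-)discount factors D^F(T_i) from
// forwards F(T_0), ..., F(T_{n-1}) on the equidistant grid T_i = i * d.
public class SyntheticDiscountFactors {

	/** Returns D^F(T_i) = prod_{k=0}^{i-1} (1 + F(T_k) * d)^{-1} for i = 0, ..., n. */
	public static double[] pseudoDiscountFactors(double[] forwards, double periodLength) {
		double[] discountFactors = new double[forwards.length + 1];
		discountFactors[0] = 1.0;	// D^F(T_0) = empty product
		for (int i = 0; i < forwards.length; i++) {
			discountFactors[i + 1] = discountFactors[i] / (1.0 + forwards[i] * periodLength);
		}
		return discountFactors;
	}

	public static void main(String[] args) {
		// Quarterly forwards of 2.0%, 2.5%, 3.0% with period length d = 0.25.
		double[] dF = pseudoDiscountFactors(new double[] { 0.02, 0.025, 0.03 }, 0.25);
		for (double d : dF) System.out.println(d);
	}
}
```

Note how each *D*<sup>*F*</sup>(*T*<sub>*i*</sub>) depends on *all* preceding forwards; this recursion is the source of the error propagation discussed below.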

It is obvious that this definition of the interpolation entity for forward curves is complex, results in problems for non-equidistant interpolation points, and is, without further assumptions, not backed by a meaningful interpretation. First, in a multi-curve setup this approach lacks an economic justification. Second, it may introduce numerical problems.


#### **5.3.2 Alternative Interpolation Schemes for Forward Curves**

The definition of the forward curve in the multi-curve setup suggests an appealing alternative for the creation of an interpolated forward: Like a discount factor curve, the curve *V*(*T*) = *F*(*T*) · *D*(*T* + *d*) represents the value of a financial product. Hence, we may consider the interpolation of *V* like we did for the curve *D*. For example, if we consider linear interpolation of the value curve *V*, we interpolate the forward curve *F* by considering the interpolation entity *F*(*T*) · *D*(*T* + *d*) with a given discount curve *D*, i.e., we have

$$F(T) := \frac{1}{D(T+d)} \left( \frac{T-T\_i}{T\_{i+1}-T\_i} F(T\_{i+1})D(T\_{i+1}+d) + \frac{T\_{i+1}-T}{T\_{i+1}-T\_i} F(T\_i)D(T\_i+d) \right)$$

for *T*<sub>*i*</sub> ≤ *T* ≤ *T*<sub>*i*+1</sub> and given points *F*(*T*<sub>*j*</sub>).
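The interpolation formula above can be sketched as follows (a self-contained illustration; the discount curve is passed as a plain function and the names are ours):

```java
import java.util.function.DoubleUnaryOperator;

// Sketch: interpolating the forward F(T) by linear interpolation of the
// value curve V(T) = F(T) * D(T+d), given a discount curve D.
public class ValueCurveInterpolation {

	public static double forward(double T, double t1, double f1, double t2, double f2,
			double periodLength, DoubleUnaryOperator discountCurve) {
		double v1 = f1 * discountCurve.applyAsDouble(t1 + periodLength);	// V(T_i)
		double v2 = f2 * discountCurve.applyAsDouble(t2 + periodLength);	// V(T_{i+1})
		double v = ((T - t1) * v2 + (t2 - T) * v1) / (t2 - t1);			// linear in V
		return v / discountCurve.applyAsDouble(T + periodLength);		// back to a forward
	}

	public static void main(String[] args) {
		// Hypothetical flat 1% (continuously compounded) discount curve.
		DoubleUnaryOperator discount = t -> Math.exp(-0.01 * t);
		System.out.println(forward(0.75, 0.5, 0.02, 1.0, 0.03, 0.25, discount));
	}
}
```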

Given that log-linear interpolation is a popular interpolation scheme for discount curves, one may consider log-linear interpolation of *V*. This interpolation scheme has the restriction that the forward is required to be positive. Since negative interest rates are possible, this interpolation scheme is not appropriate for interest rate curves.

# *5.4 Assessment of the Interpolation Method*

Assessing the performance of an interpolation method is difficult. Some basic criteria (like continuity, locality, etc.) have been reviewed in [16]. Locality, i.e., the extent to which a local change in the input data affects the curve, is a desirable property from a hedging perspective. In [13] a long-term dynamic hedge is used to assess the performance of an interpolation scheme. The results in [13] suggest that, among the local methods, linear interpolation of the forward curve and log-linear interpolation of the discount curve were the best performing schemes when using the cumulated dynamic hedge error as the primary criterion.

# **6 Implementation of the Calibration of Curves**

A curve (discount curve or forward curve) is used to encode values of market instruments. A forward curve, together with its associated discount curve, allows valuing all linear products (linear payoffs) in the corresponding currency under the corresponding collateralization.

The standard way to calibrate a curve is hence to match given market values of (linear) instruments (e.g., swaps). For each market value, a single "point" of a single curve is calibrated. Hence, the total number of calibrated curve interpolation points (aggregated across all curves) equals the number of market instruments.

By "sorting" and combining the calibration instruments, the corresponding equations can be brought into the form of a system of equations with a triangular structure, i.e., the value of the *n*th calibration instrument only depends on the first *n* curve points. This allows for an iterative construction of the curve.

However, here (and in the associated reference implementation [11]) we propose to calibrate the curves using a multivariate optimization algorithm, like the Levenberg–Marquardt algorithm or a differential evolution algorithm. This approach brings several advantages, e.g., the freedom to specify the calibration instruments and the ability to extend the approach to over-determined systems of equations. In addition, we can handle the case of curve interdependence, for example calibrating certain discount curves from cross-currency swaps. This comes at the cost of a longer calculation time.

What remains is to specify the valuation equations for the calibration instruments. To simplify the implementation, we generalize the definition of a "swap" to comprise plain swaps, tenor basis swaps, and cross-currency swaps.

# *6.1 Generalized Definition of a Swap*

Many of the following calibration instruments (from OIS swaps to cross-currency basis swaps) fit under a generalized definition of a swap. The swap consists of two legs. Each leg consists of several periods [*T*<sub>*i*</sub>, *T*<sub>*i*+1</sub>]. To ease notation, we do not distinguish between period start time, period end time, fixing time of the index, and payment time. We assume that for the period [*T*<sub>*i*</sub>, *T*<sub>*i*+1</sub>] the index fixing is in *T*<sub>*i*</sub> and the payment is in *T*<sub>*i*+1</sub>. This is done purely to ease notation; the generalization to distinct times is straightforward.

**Definition 6** (*Swap Leg*) A swap leg pays a multiple α of the index *I* fixed in *T*<sub>*i*</sub> plus some fixed payment *X*, both in currency unit *U*, collateralized by the collateral account C and paid in *T*<sub>*i*+1</sub>. Here α and *X* are constants (possibly zero). The value of the swap leg can be expressed in terms of forwards and discount factors as

$$V\_{SwapLeg}^{U,\mathbf{C}}(\alpha I, X, \{T\_i\}\_{i=0}^n) = \sum\_{i=0}^{n-1} \left(\alpha F^{U,\mathbf{C}}(T\_i) + X\right) \cdot P^{U,\mathbf{C}}(T\_{i+1}),$$

where *F*<sup>*U*,C</sup> denotes the forward curve of the index *I* paid in currency *U* collateralized with respect to C, and *P*<sup>*U*,C</sup> denotes the corresponding discount curve.
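The valuation formula of Definition 6 amounts to a simple sum. A minimal sketch (with forwards and discount factors given as hypothetical arrays on the tenor grid; not the finmath library's SwapLeg class):

```java
// Sketch: valuation of the swap leg of Definition 6.
public class SwapLegValuation {

	/**
	 * Values sum_{i=0}^{n-1} (alpha * F(T_i) + X) * P(T_{i+1}), where
	 * forwards[i] = F(T_i) and discountFactors[i] = P(T_{i+1}).
	 */
	public static double value(double alpha, double[] forwards, double fixedPayment, double[] discountFactors) {
		double value = 0.0;
		for (int i = 0; i < forwards.length; i++) {
			value += (alpha * forwards[i] + fixedPayment) * discountFactors[i];
		}
		return value;
	}

	public static void main(String[] args) {
		double[] forwards = { 0.02, 0.025 };		// F(T_0), F(T_1)
		double[] discountFactors = { 0.99, 0.98 };	// P(T_1), P(T_2)
		System.out.println(value(1.0, forwards, 0.0, discountFactors)); // 0.02*0.99 + 0.025*0.98
	}
}
```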

A swap leg with notional exchange has the payments as in Definition 6 together with an additional payment of −1 in *T*<sub>*i*</sub> and +1 in *T*<sub>*i*+1</sub>. The value of the swap leg with notional exchange can be expressed in terms of forwards and discount factors as

$$\begin{aligned} V\_{SwapLeg}^{U,\mathbf{C}}(\alpha I, X, \{T\_i\}\_{i=0}^n) = \sum\_{i=0}^{n-1} \Big( &\left( \alpha F^{U,\mathbf{C}}(T\_i) + X \right) \cdot P^{U,\mathbf{C}}(T\_{i+1}) \\ &+ P^{U,\mathbf{C}}(T\_{i+1}) - P^{U,\mathbf{C}}(T\_i) \Big), \end{aligned}$$

where *F*<sup>*U*,C</sup> denotes the forward curve of the index *I* paid in currency *U* collateralized with respect to C, and *P*<sup>*U*,C</sup> denotes the corresponding discount curve.

**Definition 7** (*Swap*) A swap exchanges the payments of two swap legs, the receiver leg and the payer leg. We allow the legs to have different indices, different fixed payments, different payment times, and different currency units, but they are collateralized with respect to the same account C. The swap receives a swap leg with value $V\_{SwapLeg}^{U\_1,\mathbb{C}}(\alpha\_1 I\_1, X\_1, \{T\_i^1\}\_{i=0}^{n\_1})$ and pays a leg with value $V\_{SwapLeg}^{U\_2,\mathbb{C}}(\alpha\_2 I\_2, X\_2, \{T\_i^2\}\_{i=0}^{n\_2})$. Since the currency units of the two legs may differ, the value of the swap in currency *U*<sub>1</sub> is

$$V\_{Swap} = V\_{SwapLeg}^{U\_1, \mathbb{C}}(\alpha\_1 I\_1, X\_1, \{T\_i^1\}\_{i=0}^{n\_1}) - V\_{SwapLeg}^{U\_2, \mathbb{C}}(\alpha\_2 I\_2, X\_2, \{T\_i^2\}\_{i=0}^{n\_2}) \cdot FX^{\frac{U\_1}{U\_2}}.$$

Many instruments can be represented (and hence valued) in this form. We will now list a few of them.

# *6.2 Calibration of Discount Curve to Swap Paying the Collateral Rate (aka. Self-Discounted Swaps)*

Discount curves can be calibrated to swaps paying the performance index of their collateral account, for example, a swap as in Definition 7 where both legs pay in the same currency *U* = *U*<sub>1</sub> = *U*<sub>2</sub>. In a receiver swap the receiver leg pays a fixed rate *C* and the payer leg pays an index *I*. The value of the swap can then be expressed in terms of the discount factors *P*<sup>*U*,C</sup>(*T*<sub>*i*+1</sub>; *t*) only, which allows calibrating this curve using these swaps. Overnight index swaps are an example.

For the swap paying the performance of the collateral account we have

$$\begin{array}{ll} X\_1 = C = \text{const.} = \text{given}, & X\_2 = 0, \\ F\_1^{U,\mathbb{C}}(T\_i^1) = 0, & F\_2^{U,\mathbb{C}}(T\_i^2) = \dfrac{P^{U,\mathbb{C}}(T\_i^2;t\_0) - P^{U,\mathbb{C}}(T\_{i+1}^2;t\_0)}{P^{U,\mathbb{C}}(T\_{i+1}^2;t\_0)} = \text{calibrated}, \\ P\_1^{U,\mathbb{C}} = P^{U,\mathbb{C}} = \text{calibrated}, & P\_2^{U,\mathbb{C}} = P^{U,\mathbb{C}} = \text{calibrated}. \end{array}$$

In a situation where the number of interpolation points matches the number of swaps (e.g., in a bootstrapping), we calibrate the time-*T* discount factor *P*<sup>*U*,C</sup>(*T*; *t*<sub>0</sub>) with $T = \max(T^1\_n, T^2\_n)$ being the last payment time of a given swap.
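In the bootstrap situation, each such step amounts to solving a linear equation for the last discount factor. A minimal sketch under assumed conventions (fixed leg accruing with period length *dt*; the float leg paying the collateral performance has value 1 − *P*(*T*<sub>*n*</sub>); names are ours):

```java
// Sketch (bootstrap step): given already calibrated discount factors
// P(T_1), ..., P(T_{n-1}) and the par rate C of a swap paying the collateral
// rate, equate the fixed leg C*dt*sum_i P(T_i) to the float leg 1 - P(T_n)
// and solve for the last discount factor P(T_n).
public class SelfDiscountedSwapBootstrap {

	public static double lastDiscountFactor(double parRate, double periodLength, double[] knownDiscountFactors) {
		double annuity = 0.0;	// sum of P(T_1), ..., P(T_{n-1})
		for (double p : knownDiscountFactors) annuity += p;
		// C*dt*(annuity + P(T_n)) = 1 - P(T_n)  =>  solve for P(T_n):
		return (1.0 - parRate * periodLength * annuity) / (1.0 + parRate * periodLength);
	}

	public static void main(String[] args) {
		// Two-period swap with P(T_1) = 0.995 already known and par rate 1%.
		double p2 = lastDiscountFactor(0.01, 0.5, new double[] { 0.995 });
		System.out.println(p2);
	}
}
```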

# *6.3 Calibration of Forward Curves*

Given a calibrated discount curve *P*<sup>*U*,C</sup>, we consider a swap with payments in currency *U* collateralized with respect to the account C, paying some index *I* and receiving some fixed cash flow *C*. An example is a swap paying the 3M LIBOR rate. For such a swap we have

$$\begin{array}{lcl} X\_1 = \text{C = const. = given,} & X\_2 = 0, \\ F\_1^{U\_1, \mathbb{C}}(T\_i^1) = 0, & F\_2^{U\_2, \mathbb{C}}(T\_i^2) = F^{U, \mathbb{C}}(T\_i^2) = \text{calibrated,} \\ P\_1^{U\_1, \mathbb{C}} = P^{U, \mathbb{C}} = \text{given,} & P\_2^{U, \mathbb{C}} = P^{U, \mathbb{C}} = \text{given.} \\ \end{array}$$

From one such swap we calibrate the time-*T* forward *F*<sup>*U*,C</sup>(*T*) of *I*(*T*) with $T = T^2\_{n-1}$ (the last fixing time).
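A single bootstrap step for the last forward can be sketched as follows (assumed conventions: both legs accrue with the same period length *dt*, discount curve given; names are ours):

```java
// Sketch (bootstrap step): with discount factors P(T_1), ..., P(T_n) given
// and forwards F(T_0), ..., F(T_{n-2}) already calibrated, the par rate C of
// a fix-vs-float swap determines the last forward F(T_{n-1}) from
// C * sum_i dt * P(T_{i+1}) = sum_i F(T_i) * dt * P(T_{i+1}).
public class ForwardCurveBootstrap {

	/** discountFactors[i] = P(T_{i+1}); knownForwards[i] = F(T_i) for i < n-1. */
	public static double lastForward(double parRate, double periodLength,
			double[] knownForwards, double[] discountFactors) {
		int n = discountFactors.length;
		double fixedLeg = 0.0, floatLeg = 0.0;
		for (int i = 0; i < n; i++) fixedLeg += parRate * periodLength * discountFactors[i];
		for (int i = 0; i < n - 1; i++) floatLeg += knownForwards[i] * periodLength * discountFactors[i];
		return (fixedLeg - floatLeg) / (periodLength * discountFactors[n - 1]);
	}

	public static void main(String[] args) {
		// Two-period swap, F(T_0) = 1.5% known, par rate 2%.
		double f = lastForward(0.02, 0.5, new double[] { 0.015 }, new double[] { 0.995, 0.99 });
		System.out.println(f);
	}
}
```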

Given a calibrated discount curve *P*<sup>*U*,C</sup> and a calibrated forward curve $F\_1^{U,\mathbb{C}}$ belonging to the index *I*<sub>1</sub>, both in currency *U* and collateralized with respect to the account C, we consider a swap collateralized with respect to the account C, paying some index *I*<sub>2</sub> = *I* in currency *U* and receiving the index *I*<sub>1</sub> in currency *U*. An example is a tenor basis swap paying the 6M LIBOR rate and receiving the 3M LIBOR rate. For such a swap we have

$$\begin{array}{ll} X\_1 = C\_1 = \text{const.} = \text{given}, & X\_2 = C\_2 = \text{const.} = \text{given}, \\ F\_1^{U,\mathbb{C}}(T\_i^1) = \text{given}, & F\_2^{U,\mathbb{C}}(T\_i^2) = \text{calibrated}, \\ P\_1^{U,\mathbb{C}} = P^{U,\mathbb{C}} = \text{given}, & P\_2^{U,\mathbb{C}} = P^{U,\mathbb{C}} = \text{given}. \end{array}$$

From one such swap we calibrate the time-*T* forward $F\_2^{U,\mathbb{C}}(T)$ of *I*<sub>2</sub>(*T*) with $T = T^2\_{n-1}$ (the last fixing time of index *I*<sub>2</sub>).

# *6.4 Calibration of Discount Curves When Payment and Collateral Currency Differ*

#### **6.4.1 Fixed Payment in Other Currency**

Given a calibrated discount curve *P*<sup>*U*<sub>1</sub>,C</sup>, we consider a swap collateralized with respect to the account C, paying some index *I*<sub>1</sub> in currency *U*<sub>1</sub> and receiving some fixed cash flow *C*<sub>2</sub> in currency *U*<sub>2</sub>. An example of such a swap is a cross-currency swap paying a floating index *I* in the collateral currency and receiving a fixed *C*<sub>2</sub> in a different currency.<sup>12</sup> For such a swap we have

$$\begin{array}{ll} X\_1 = C\_1 = \text{const.} = \text{given}, & X\_2 = C\_2 = \text{const.} = \text{given}, \\ F\_1^{U\_1,\mathbb{C}}(T\_i^1) = \text{given}, & F\_2^{U\_2,\mathbb{C}}(T\_i^2) = 0, \\ P\_1^{U\_1,\mathbb{C}} = P^{U\_1,\mathbb{C}} = \text{given}, & P\_2^{U\_2,\mathbb{C}} = P^{U\_2,\mathbb{C}} = \text{calibrated}. \end{array}$$

We calibrate the discount factor $P^{U\_2,\mathbb{C}}(T;t\_0)$ with $T = T^2\_n$ (the last payment time in currency *U*<sub>2</sub>).

<sup>12</sup>Usually cross-currency swaps exchange two floating indices, we will consider this case below.

#### **6.4.2 Float Payment in Other Currency**

If instead of a fixed payment an index *I*<sub>2</sub> is paid in another currency *U*<sub>2</sub>, we may encounter the problem that the swap has two unknowns, namely the discount curve $P^{U\_2,\mathbb{C}}$ for payments in currency *U*<sub>2</sub> collateralized with respect to C and the forward curve $F\_2^{U\_2,\mathbb{C}}$ of the index *I*<sub>2</sub> paid in currency *U*<sub>2</sub> collateralized with respect to C. The two curves can be obtained jointly from two different swaps: first, a fix-versus-float swap in currency *U*<sub>2</sub> collateralized by C, and second, a cross-currency swap exchanging the index *I*<sub>2</sub> with an index *I*<sub>1</sub> in currency *U*<sub>1</sub> for which the forward $F\_1^{U\_1,\mathbb{C}}$ is known. For the first instrument we denote the fixed payments by *C*<sub>1</sub>, *C*<sub>2</sub>. For the second instrument we denote the fixed payments by *s*<sub>1</sub>, *s*<sub>2</sub> (usually a spread). For the first instrument we have

$$\begin{array}{ll} X\_1 = C\_1 = \text{const.} = \text{given}, & X\_2 = C\_2 = \text{const.} = \text{given}, \\ F\_1^{U\_2,\mathbb{C}}(T\_i^1) = 0, & F\_2^{U\_2,\mathbb{C}}(T\_i^2) = \text{calibrated}, \\ P\_1^{U\_2,\mathbb{C}} = P^{U\_2,\mathbb{C}} = \text{calibrated}, & P\_2^{U\_2,\mathbb{C}} = P^{U\_2,\mathbb{C}} = \text{calibrated}. \end{array}$$

For the second swap we have

$$\begin{array}{ll} X\_1 = s\_1 = \text{const.} = \text{given}, & X\_2 = s\_2 = \text{const.} = \text{given}, \\ F\_1^{U\_1,\mathbb{C}}(T\_i^1) = \text{given}, & F\_2^{U\_2,\mathbb{C}}(T\_i^2) = \text{calibrated}, \\ P\_1^{U\_1,\mathbb{C}} = \text{given}, & P\_2^{U\_2,\mathbb{C}} = \text{calibrated}. \end{array}$$

We calibrate the discount factor $P^{U\_2,\mathbb{C}}(T;t\_0)$ with $T = T^2\_n$ and the forward $F\_2^{U\_2,\mathbb{C}}(T)$ with $T = T^2\_{n-1}$.

Often market data are not available to calibrate the forward $F\_2^{U\_2,\mathbb{C}}$, but the forward $F\_2^{U\_2,\mathbb{C}\_2}$ collateralized with respect to a different account C<sub>2</sub> is available. The two forwards differ by a possible convexity adjustment. One possible approximation (which would follow from the assumption that forwards are independent of their collateralization) is to use $F\_2^{U\_2,\mathbb{C}} \approx F\_2^{U\_2,\mathbb{C}\_2}$.

The joint calibration of the two curves can be decomposed into two independent calibration steps, which would then allow re-using a traditional bootstrap algorithm, see, e.g., [4].

#### *Calibration of Discount Curves as Spread Curves*

We consider a swap leg with notional exchange and tenor $\{T\_i\}\_{i=0}^n$, paying an index *I* plus some constant *X* = *s*(*T*<sub>*n*</sub>) = const. Here *s*(*T*<sub>*n*</sub>) has the interpretation of a maturity-dependent spread. If this leg is in currency *U* and with respect to a collateral account (here a funding account) D, then its value is

$$\begin{aligned} V\_{SwapLeg}^{U,\mathsf{D}}(\alpha I, X, \{T\_i\}\_{i=0}^n) = \sum\_{i=0}^{n-1} \Big( &(\alpha F^{U,\mathsf{D}}(T\_i) + X) \cdot P^{U,\mathsf{D}}(T\_{i+1}) \\ &+ P^{U,\mathsf{D}}(T\_{i+1}) - P^{U,\mathsf{D}}(T\_i) \Big). \end{aligned}$$

An example of such an instrument is an (uncollateralized) floating rate bond, paying a 3M rate plus some spread. If we assume that the forward *F*<sup>*U*,D</sup>(*T*<sub>*i*</sub>) is known, this instrument can be used to calibrate the discount curve *P*<sup>*U*,D</sup>. In fact, *I* + *X* represents the performance of the funding account associated with *P*<sup>*U*,D</sup>.

If the forward *F*<sup>*U*,D</sup>(*T*<sub>*i*</sub>) is not known, we encounter the same problem as for cross-currency swaps, namely that the forward curve *F*<sup>*U*,D</sup>(*T*<sub>*i*</sub>) and the discount curve *P*<sup>*U*,D</sup> need to be calibrated jointly from two instruments. The first one is a swap which is collateralized with respect to the funding account D, i.e., an uncollateralized swap. The second is the funding floater.

For the first instrument, the uncollateralized swap, we have

$$\begin{array}{ll} X\_1 = C\_1 = \text{const.} = \text{given}, & X\_2 = C\_2 = \text{const.} = \text{given}, \\ F\_1^{U,\mathsf{D}}(T\_i^1) = 0 = \text{given}, & F\_2^{U,\mathsf{D}}(T\_i^2) = F^{U,\mathsf{D}}(T\_i^2) = \text{calibrated}, \\ P\_1^{U,\mathsf{D}} = P^{U,\mathsf{D}} = \text{calibrated}, & P\_2^{U,\mathsf{D}} = P^{U,\mathsf{D}} = \text{calibrated}. \end{array}$$

For the second instrument, the funding floating rate bond (the uncollateralized swap leg with notional exchange), we have

$$\begin{aligned} X\_1 &= S = \text{const.} = \text{given}, \\ F\_1^{U, \mathsf{D}}(T\_i^1) &= F^{U, \mathsf{D}}(T\_i^1) = \text{ calibrated}, \\ P\_1^{U, \mathsf{D}} &= P^{U, \mathsf{D}} = \text{calibrated.} \end{aligned}$$

*Remark 6* The calibration of the funding curve *P*<sup>*U*,D</sup> is analogous to the calibration of the cross-currency discount curve *P*<sup>*U*<sub>2</sub>,C</sup>.

In the above, we considered the *funding floater* as a floating rate bond. Note, however, that bonds (in contrast to swaps) do not permit negative coupons, hence they have an implicit floor. There are two ways to solve this problem: either one incorporates an option premium in the calibration procedure (which requires a model for the volatility), or one considers only market data of fixed bonds together with uncollateralized swaps (which likely requires some assumption, since this calibration instrument is usually not observed). See the following section.

# *6.5 Lack of Calibration Instruments (for Difference in Collateralization)*

The calibration of cross-currency curves (forward curves and discount curves for currency *U*<sub>2</sub> with collateralization in currency *U*<sub>1</sub>, see Sect. 6.4) and the calibration of uncollateralized curves (forward curves and discount curves for uncollateralized products, see section "Calibration of Discount Curves as Spread Curves") may require market data which are not available, e.g., the forward of an index *I* paid in currency *U*<sub>2</sub> collateralized in a different currency or by a different account. This issue has been pointed out in [14].

In this case the curve can be obtained only by adding assumptions, e.g., assuming that the forwards are independent of the collateralization, or assuming that the market (par swap) rates are independent of the collateralization.


The two assumptions lead to different results, since they imply different correlations, which in turn lead to different (convexity) adjustments. For details on the example see [11], where a sample calculation assuming identical market rates for 3M swaps collateralized in USD-OIS or EUR-OIS results in a difference of around 1 to 2 basis points (1 bp = 0.01%) for the forward curves.

# *6.6 Implementation*

The definition of the various calibration instruments indicates that an iterative bootstrapping algorithm (where the curve is built in a step-by-step process, solving only one-dimensional problems in one variable) is no longer straightforward. This is due to the interdependence of discount and forward curves. While this problem may be solved in some cases via pre-processing (see [4]), we suggest a different route: we propose to solve the calibration problem via a single optimization run on the full multi-dimensional problem. This also allows calibrating curves in the sense of a best fit in cases where we use more calibration instruments than curve points, resulting in an over-determined system.

We provide an object-oriented implementation at [11], implementing the Java classes Curve, DiscountCurve, ForwardCurve, Solver, SwapLeg, and Swap.

A detailed discussion of the implementation can be found in the associated JavaDocs and is left out here to shorten the presentation.

# **7 Redefining Forward Rate Market Models**

Having discussed the setup of curves, we would like to conclude with a remark on how the curves are integrated into term-structure models, specifically, how the multi-curve setup harmonizes with a classical single curve standard LIBOR market model, which can then be extended to a full multi-curve model.

If *N*<sup>C</sup> denotes an accrual account, i.e., *N*<sup>C</sup> is a process with *N*<sup>C</sup>(*t*<sub>0</sub>) = 1 *U* (e.g., a collateral account), then *N*<sup>C</sup> defines a discount curve, namely the discount curve *T* → *P*<sup>*U*,C</sup>(*T*; *t*<sub>0</sub>) =: *P*<sup>C</sup>(*T*; *t*<sub>0</sub>) of fixed payments made in *T*, valued in *t*<sub>0</sub>, and collateralized by units of *N*<sup>C</sup>.

Now let {*T*<sub>*i*</sub>} denote a given tenor discretization. As shown in Sect. 4.1, the period-[*T*<sub>*i*</sub>, *T*<sub>*i*+1</sub>] performance index $I^{\mathbb{C}}(T\_i, T\_{i+1})$ of the accrual account, i.e., $I^{\mathbb{C}}(T\_i, T\_{i+1}; T\_i) := \frac{N^{\mathbb{C}}(T\_{i+1})}{N^{\mathbb{C}}(T\_i)} - 1$, has the property that its time-*t* *forward* (of a payment of $I^{\mathbb{C}}(T\_i, T\_{i+1})$ made in *T*<sub>*i*+1</sub>, collateralized in units of *N*<sup>C</sup>), following the definition of a forward from Definition 3, is given as $F^{U,\mathbb{C}}(T\_i, T\_{i+1}; t) := \frac{P^{\mathbb{C}}(T\_i;t) - P^{\mathbb{C}}(T\_{i+1};t)}{P^{\mathbb{C}}(T\_{i+1};t)}$.

This relation allows us to create a term-structure model for the curve *P*<sup>C</sup> which has the same structural properties as a standard single curve (LIBOR) market model. This model is given by a joint modeling of the processes $L\_i(t) := \frac{F^{U,\mathbb{C}}(T\_i, T\_{i+1}; t)}{T\_{i+1} - T\_i}$, e.g., as log-normal processes under the measure $\mathbb{Q}^{N^{\mathbb{C}}}$, and the additional assumption that the process $P^{\mathbb{C}}(T\_i; t)$ is deterministic on its short period $t \in (T\_{i-1}, T\_i]$.

From these two assumptions it follows that the processes *L*<sub>*i*</sub> have the structure of a standard LIBOR market model and $\mathbb{Q}^{N^{\mathbb{C}}}$ corresponds to the spot measure. Indeed, we have $\prod\_{j=0}^{i-1} \left(1 + L\_j(T\_j) \cdot (T\_{j+1} - T\_j)\right) = N^{\mathbb{C}}(T\_i)$.

What we have described is how to use the standard LIBOR market model as a term-structure model for the collateral account *N*<sup>C</sup> (e.g., the OIS curve). Modeling all other rates (including LIBOR) can then be performed by modeling (possibly stochastic) spreads over this curve. This is analogous to a defaultable market model.

An alternative is to start with a stochastic model for the forward rates, where the forward curve now defines the initial values of the model SDEs, and then to define the discount curve (numéraire) via deterministic or stochastic spreads. This approach has a practical advantage, since implied volatilities are more liquid for LIBOR rates than for OIS rates. See, e.g., [20] and references therein. An implementation of the standard LIBOR market model with a deterministic adjustment for the discount curve is provided by the author at [9].

# **8 Some Numerical Results**

# *8.1 Impact of the Interpolation Entity of a Forward Curve on the Delta Hedge*

Using our reference implementation [11], we investigate the interpolation of forward curves using different interpolation methods and interpolation entities. While interpolation of (synthetic) discount factors is a very popular interpolation approach (motivated by its single curve origin), it may result in very implausible deltas if the curve is constructed from overlapping instruments. Table 1 shows the delta of an 8x11 FRA calculated on a curve constructed from 0x3, 1x4, 2x5, 3x6, 4x7, 5x8, 6x9, 7x10, and 9x12 FRAs (note that the 8x11 is missing in the curve construction). The plausible hedge would be to use the adjacent 7x10 and 9x12 FRAs. Using the interpolation entity DISCOUNTFACTOR we find non-zero deltas for instruments prior to the 7x10 FRA, summing up to zero. This effect stems from the error propagation inherent in the definition of the interpolation entity. The interpolation entity FORWARD does not show this effect.


**Table 1** The delta of an 8Mx11M FRA with respect to different calibration instruments, where the 8Mx11M FRA is not part of the calibration instruments and hence its value interpolates

Different interpolation entities result in very different delta hedges. The popular interpolation entity of a synthetic discount factor results in counterintuitive hedges. The interpolation method is LINEAR in both cases; it is the choice of the interpolation entity which introduces the effect.

# *8.2 Impact of the Lack of Calibration Instruments for the Case of a Foreign Swap Collateralized in Domestic Currency*

Based on the curve framework and the calibration instruments defined in this paper and implemented at [11], we have investigated the impact of the assumptions which had to be made due to the lack of calibration instruments for foreign currency swaps. Since a foreign currency swap collateralized in domestic currency is (currently) not a liquid instrument, the *foreign forward with respect to domestic collateralization* cannot be calibrated. Hence, a model assumption is required. Two possible assumptions are: (1) the forward rate is independent of its collateralization, that is, use the foreign forward curve derived from instruments collateralized in *foreign* currency; or (2) the market (swap) rates are independent of the collateralization, that is, use the foreign market (par) swap rates from foreign currency swaps collateralized in foreign currency, together with a domestic currency discount curve, to calibrate a foreign currency forward curve with respect to domestic collateralization. The two approaches result in different forward curves. The impact can be assessed using the spreadsheet available at [11]. For 2012 market data, the difference for a USD forward curve collateralized in EUR is around two basis points. While the first assumption (re-using the forward curve) is likely the more natural one, and maybe a market standard, the calculation shows that the assumption has a considerable impact on the resulting curve, see Fig. 1.

**Fig. 1** Forward curve (USD-3M) calibrated from swaps with different collateralization (USD-OIS and EUR-OIS), assuming independence of the market rates from the type of collateralization

# *8.3 Impact of the Interpolation Scheme on the Hedge Efficiency*

Also based on the framework presented here, the impact of different interpolation schemes has been investigated in [13], where indication was found that, among the local interpolation schemes, it is indeed better to use a different interpolation scheme for forward curves than for discount curves. For details we refer to [13].

# **9 Conclusion**

We have presented a re-definition of discount curves and forward curves which clearly distinguishes the two as different objects (with some relation for the special case of OIS curves). This re-definition results in curves representing values with well-defined economic interpretations. We then discussed interpolation schemes for these curves, where our re-definition suggests applying different interpolation schemes for discount and forward curves. This stands in contrast to the classical approach, where a forward curve had been represented via synthetic discount factors, using the same interpolation schemes for both types of curves.

We have presented the calibration procedure, defining the calibration instruments. Based on this, we provide an open-source, object-oriented implementation at [11].<sup>13</sup>

Based on this benchmark implementation, it was possible to assess the impact of the assumptions which had to be made due to the lack of calibration instruments, e.g., in the case of cross-currency swaps, as well as the impact of the different interpolation schemes. Indication was found that it is better to use a different interpolation scheme for forward curves than for discount curves. With respect to delta hedges one should

<sup>13</sup>A complete description of the implementation is given at http://www.finmath.net/finmath-lib, including source code and numerical examples; these are omitted from this paper.

favor forward interpolation over synthetic discount factor interpolation. Among the forward interpolation schemes, linear interpolation performed well with respect to hedge performance.
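The contrast between the two representations can be illustrated with a toy example (all pillar data are hypothetical; this is a sketch, not the implementation of [11]). Log-linear interpolation of synthetic discount factors implies piecewise-constant forwards, whereas interpolating the forward values directly allows, e.g., a linear transition:

```python
import math

# Toy comparison of two local interpolation schemes for a forward curve.
# Synthetic discount factors are built as P(T_i) = exp(-sum_j f_j * dT_j),
# the classical single-curve representation. Pillar data are hypothetical.
pillars = [1.0, 2.0, 3.0]
fwd = [0.010, 0.020, 0.030]          # piecewise forward rates per interval

P = [1.0]                            # synthetic discount factors, P(0) = 1
for i, t in enumerate(pillars):
    dt = t - (pillars[i - 1] if i else 0.0)
    P.append(P[-1] * math.exp(-fwd[i] * dt))

def forward_from_logP(t1, t2):
    """Forward implied by log-linear interpolation of discount factors."""
    def logP(t):
        grid = [0.0] + pillars
        for k in range(len(grid) - 1):
            if grid[k] <= t <= grid[k + 1]:
                w = (t - grid[k]) / (grid[k + 1] - grid[k])
                return (1 - w) * math.log(P[k]) + w * math.log(P[k + 1])
        raise ValueError(t)
    return -(logP(t2) - logP(t1)) / (t2 - t1)

def forward_linear(t):
    """Linear interpolation applied to the forward values directly."""
    mids = [0.5, 1.5, 2.5]           # each forward sits at its interval midpoint
    for k in range(len(mids) - 1):
        if mids[k] <= t <= mids[k + 1]:
            w = (t - mids[k]) / (mids[k + 1] - mids[k])
            return (1 - w) * fwd[k] + w * fwd[k + 1]
    return fwd[0] if t < mids[0] else fwd[-1]

print(forward_from_logP(1.0, 1.5))   # piecewise constant: 0.020 on (1, 2]
print(forward_linear(1.0))           # linear in the forwards: 0.015 at t = 1.0
```

At the pillars both schemes reproduce the input forwards; between pillars they differ, which is what drives the differing hedge performance discussed above.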

**Acknowledgements** The KPMG Center of Excellence in Risk Management is acknowledged for organizing the conference "Challenges in Derivatives Markets - Fixed Income Modeling, Valuation Adjustments, Risk Management, and Regulation".

**Open Access** This chapter is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, a link is provided to the Creative Commons license and any changes made are indicated.

The images or other third party material in this chapter are included in the work's Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work's Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.

# **References**


# **Impact of Multiple-Curve Dynamics in Credit Valuation Adjustments**

**Giacomo Bormetti, Damiano Brigo, Marco Francischello and Andrea Pallavicini**

**Abstract** We present a detailed analysis of interest rate derivatives valuation under credit risk and collateral modeling. We show how the credit and collateral extended valuation framework in Pallavicini et al. (2011) can be helpful in defining the key market rates underlying the multiple interest rate curves that characterize current interest rate markets. We introduce the collateralized valuation measures and formulate a consistent realistic dynamics for the rates emerging from our analysis. We point out limitations of multiple-curve models with deterministic basis when considering the valuation of particularly sensitive products such as basis swaps.

**Keywords** Multiple curves · Valuation adjustments · Basis swaps · Collateral · HJM model

# **1 Introduction**

After the onset of the crisis in 2007, all market instruments are quoted by taking into account, more or less implicitly, credit- and collateral-related adjustments. As a consequence, when approaching modeling problems one has to carefully check standard theoretical assumptions, which often ignore credit and liquidity issues. One has to go back to market processes and fundamental instruments, limiting oneself to models based on products and quantities that are available on the market.

G. Bormetti (B)
University of Bologna, Piazza di Porta San Donato 5, 40126 Bologna, Italy
e-mail: giacomo.bormetti@unibo.it

D. Brigo · M. Francischello
Imperial College London, London SW7 2AZ, UK
e-mail: damiano.brigo@imperial.ac.uk

M. Francischello
e-mail: m.francischello14@imperial.ac.uk

A. Pallavicini
Imperial College London and Banca IMI, Largo Mattioli, 3, 20121 Milan, Italy
e-mail: a.pallavicini@imperial.ac.uk

Referring to market observables and processes is the only means we have to validate our theoretical assumptions, so as to drop them if in contrast with observations. This general recipe is what is guiding us in this paper, where we try to adapt interest rate models for valuation to the current landscape.

A detailed analysis of the updated valuation problem one faces when including credit risk and collateral modeling (and further funding costs) has been presented elsewhere in this volume, see for example [6, 7]. We refer to those papers and references therein for a detailed discussion. Here we focus our updated valuation framework on the following key points: (i) focus on interest rate derivatives; (ii) understand how the updated valuation framework can be helpful in defining the key market rates underlying the multiple interest rate curves that characterize current interest rate markets; (iii) define collateralized valuation measures; (iv) formulate a consistent realistic dynamics for the rates emerging from the above analysis; (v) show how the framework can be applied to the valuation of particularly sensitive products such as basis swaps under credit risk and collateral posting; (vi) point out limitations in some current market practices, such as explaining the multiple curves through deterministic fudge factors or shifts, where the option embedded in the credit valuation adjustment (CVA) calculation would be priced without any volatility. For an extended version of this paper we refer to [3]. This paper is an extended and refined version of ideas that originally appeared in [24].

# **2 Valuation Equation with Credit and Collateral**

Classical interest-rate models were formulated to satisfy no-arbitrage relationships by construction, which allowed one to price and hedge forward-rate agreements in terms of risk-free zero-coupon bonds. Starting from summer 2007, with the spreading of the credit crunch, market quotes of forward rates and zero-coupon bonds began to violate usual no-arbitrage relationships. The main driver of such behavior was the liquidity crisis reducing the credit lines along with the fear of an imminent systemic break-down. As a result the impact of counterparty risk on market prices could not be considered negligible any more.

This is the first of many examples of relationships that broke down with the crisis. Assumptions and approximations stemming from valuation theory should be replaced by strategies implemented with market instruments. For instance, inclusion of CVA for interest-rate instruments, such as those analyzed in [8], breaks the relationship between risk-free zero-coupon bonds and LIBOR forward rates. Also, funding in domestic currency on different time horizons must include counterparty risk adjustments and liquidity issues, see [15], breaking again this relationship. We thus have, against the earlier standard theory,

$$L(T_0, T_1) \neq \frac{1}{T_1 - T_0}\left(\frac{1}{P_{T_0}(T_1)} - 1\right), \quad F_t(T_0, T_1) \neq \frac{1}{T_1 - T_0}\left(\frac{P_t(T_0)}{P_t(T_1)} - 1\right), \tag{1}$$

where *Pt*(*T*) is a zero-coupon bond price at time *t* for maturity *T*, *L* is the LIBOR rate and *F* is the related LIBOR forward rate. A direct consequence is the impossibility to describe all LIBOR rates in terms of a unique zero-coupon yield curve. Indeed, since 2009 and even earlier, we had evidence that the money market for the Euro area was moving to a multi-curve setting. See [1, 19, 20, 27].

# *2.1 Valuation Framework*

In order to value a financial product (for example a derivative contract), we have to discount all the cash flows occurring after the trading position is entered. We follow the approach of [25, 26] and we specialize it to the case of interest-rate derivatives, where collateralization usually happens on a daily basis, and where gap risk is not large. Hence we prefer to present such results when cash flows are modeled as happening in a continuous time-grid, since this simplifies notation and calculations. We refer to the two names involved in the financial contract and subject to default risk as investor (also called name "I") and counterparty (also called name "C"). We denote by τ*I*, and τ*C*, respectively, the default times of the investor and counterparty. We fix the portfolio time horizon *T* > 0, and fix the risk-neutral valuation model (Ω, *G* , Q), with a filtration (*Gt*)*<sup>t</sup>*∈[0,*T*] such that τ*C*, τ*<sup>I</sup>* are (*Gt*)*<sup>t</sup>*∈[0,*T*]-stopping times. We denote by E*<sup>t</sup>* [ · ] the conditional expectation under Q given *Gt*, and by E<sup>τ</sup>*<sup>i</sup>* [ ·] the conditional expectation under Q given the stopped filtration *G*<sup>τ</sup>*<sup>i</sup>* . We exclude the possibility of simultaneous defaults, and define the first default event between the two parties as the stopping time τ := τ*<sup>C</sup>* ∧ τ*<sup>I</sup>* .

We will also consider the market sub-filtration (*Ft*)*<sup>t</sup>*≥<sup>0</sup> that one obtains implicitly by assuming a separable structure for the complete market filtration (*Gt*)*<sup>t</sup>*≥0. *G<sup>t</sup>* is then generated by the pure default-free market filtration *F<sup>t</sup>* and by the filtration generated by all the relevant default times monitored up to *t* (see for example [2]).

We introduce a risk-free rate *r* associated with the risk-neutral measure. We therefore need to define the related stochastic discount factor *D*(*t*, *u*,*r*) that in general will denote the risk-neutral default-free discount factor, given by the ratio

$$D(t, u; r) = B_t / B_u\,, \quad dB_t = r_t\,B_t\,dt\,,$$

where *B* is the bank account numeraire, driven by the risk-free instantaneous interest rate *rt* and associated to the risk-neutral measure Q. This rate *rt* is assumed to be (*Ft*)*<sup>t</sup>*∈[0,*T*] adapted and is the key variable in all pre-crisis term structure modeling.

We now want to price a collateralized derivative contract, and in particular we assume that collateral re-hypothecation is allowed, as done in practice (see [4] for a discussion on re-hypothecation). We thus write directly the adjustment payout terms as carry costs cash flows, each accruing at the relevant rate, namely the price *Vt* of a derivative contract, inclusive of collateralized credit and debit risk, margining costs, can be derived by following [25, 26], and is given by:

$$V\_t = \mathbb{E}\left[\int\_t^T D(t, u; r) \left(1\_{\{u < \tau\}} d\pi\_u + 1\_{\{\tau \in du\}} \theta\_u + (r\_u - c\_u) C\_u du\right) \mid \mathcal{G}\_t\right] \tag{2}$$

where


Notice that the above valuation equation (2) is not suited for explicit numerical evaluations, since the right-hand side is still depending on the derivative price via the indicators within the collateral rates and possibly via the close-out term, leading to recursive/nonlinear features. We could resort to numerical solutions, as in [11], but, since our goal is valuing interest-rate derivatives, we prefer to further specialize the valuation equation for such deals.

# *2.2 The Master Equation Under Change of Filtration*

In this first work we develop our analysis without considering a dependence between the default times except through their spreads, or more precisely by assuming that the default times are *F*-conditionally independent. Moreover, we assume that the collateral account and the close-out processes are *F*-adapted. Thus, we can simplify the valuation equation given by (2) by switching to the default-free market filtration. Following the filtration switching formula in [2], we introduce for any $\mathcal{G}_t$-adapted process $X_t$ a unique $\mathcal{F}_t$-adapted process $\widetilde{X}_t$, defined such that $1_{\{\tau>t\}} X_t = 1_{\{\tau>t\}} \widetilde{X}_t$. Hence, we can write the pre-default price process as $1_{\{\tau>t\}} \widetilde{V}_t = V_t$, where the right-hand side is given in Eq. (2) and where $\widetilde{V}_t$ is $\mathcal{F}_t$-adapted. Before changing filtration, we have to specify the form of the close-out payoff<sup>1</sup>:

$$\theta_\tau = \varepsilon_\tau(\tau, T) - 1_{\{\tau_C \le \tau_I\}}\,LGD_C\left(\varepsilon_\tau(\tau, T) - C_\tau\right)^{+} - 1_{\{\tau_I \le \tau_C\}}\,LGD_I\left(\varepsilon_\tau(\tau, T) - C_\tau\right)^{-}$$

<sup>1</sup>The close-out value is the residual value of the contract at default time; the CSA specifies the way it should be computed.

where $LGD \le 1$ is the loss given default, $(x)^+$ denotes the positive part of $x$, and $(x)^- = -(-x)^+$. For an extended discussion of the term $\theta_\tau$ we refer to [3]. Moreover, to derive an explicit valuation formula we assume that gap risk is not present, namely $\widetilde{V}_{\tau^-} = \widetilde{V}_{\tau}$, and we consider a particular form for collateral and close-out prices, namely we model the close-out value as

$$\varepsilon_s(t, T) = \mathbb{E}\left[\int_t^T D(t, u; r)\,d\pi_u \mid \mathcal{G}_s\right], \quad C_t \doteq \alpha_t\,\varepsilon_t(t, T)$$

with $0 \le \alpha_t \le 1$ and where $\alpha_t$ is $\mathcal{F}_t$-adapted. This means that the close-out is the risk-free mark-to-market at the first default time and the collateral is a fraction $\alpha_t$ of the close-out value. An alternative approximation that does not impose a proportionality between the account value processes can be found in [9]. By switching to the default-free market filtration $\mathcal{F}$, we obtain the following.<sup>2</sup>

**Proposition 1** (Master equation under *F*-conditionally independent default times, no gap risk and $\mathcal{F}_t$-measurable payout $\pi_t$) *Under the above assumptions, Valuation Equation (2) is further specified as* $V_t = 1_{\{\tau>t\}} \widetilde{V}_t$ *with*

$$\begin{aligned} \widetilde{V}\_{t} &= \varepsilon\_{t}(t,T) + \mathbb{E}\left[\int\_{t}^{T} D(t,u;r+\lambda)(r\_{u}-c\_{u})\alpha\_{u}\varepsilon\_{u}(u,T)du \mid \mathcal{F}\_{t}\right] \\ &- \mathbb{E}\left[\int\_{t}^{T} D(t,u;r+\lambda)\lambda\_{u}^{C}(1-\alpha\_{u})LGD\_{C}(\varepsilon\_{u}(u,T))^{+}du \mid \mathcal{F}\_{t}\right] \\ &- \mathbb{E}\left[\int\_{t}^{T} D(t,u;r+\lambda)\lambda\_{u}^{I}(1-\alpha\_{u})LGD\_{I}(\varepsilon\_{u}(u,T))^{-}du \mid \mathcal{F}\_{t}\right] \end{aligned}$$

*where we introduced the pre-default intensity* λ*<sup>I</sup> <sup>t</sup> of the investor and the pre-default intensity* λ*<sup>C</sup> <sup>t</sup> of the counterparty as*

$$1_{\{\tau_I>t\}}\,\lambda_t^I\,dt := \mathbb{Q}\left\{\tau_I \in dt \mid \tau_I > t,\, \mathcal{F}_t\right\}, \quad 1_{\{\tau_C>t\}}\,\lambda_t^C\,dt := \mathbb{Q}\left\{\tau_C \in dt \mid \tau_C > t,\, \mathcal{F}_t\right\}$$

*along with their sum* $\lambda_t := \lambda_t^I + \lambda_t^C$ *and the discount factor for any rate* $x_u$, *namely* $D(t, T; x) := \exp\{-\int_t^T x_u\,du\}$*.*
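As a small numerical aside, the credit-adjusted discount factor $D(t, T; r + \lambda)$ appearing in the proposition can be evaluated on a time grid; for deterministic rates it factorizes into $D(t, T; r)\,D(t, T; \lambda)$. The Python sketch below is purely illustrative, with hypothetical rate curves:

```python
import math

# Numerical sketch of D(t,T;x) = exp(-int_t^T x_u du) from Proposition 1,
# evaluated by the composite trapezoidal rule. The rate curves below are
# hypothetical deterministic examples.

def discount(rate, t, T, n=1000):
    """exp(-int_t^T rate(u) du) via the trapezoidal rule on n intervals."""
    h = (T - t) / n
    grid = [t + i * h for i in range(n + 1)]
    integral = h * (sum(rate(u) for u in grid) - 0.5 * (rate(t) + rate(T)))
    return math.exp(-integral)

r   = lambda u: 0.01 + 0.002 * u        # hypothetical short rate
lam = lambda u: 0.015                   # hypothetical total intensity lambda

# For deterministic rates the adjusted discount factor factorizes:
d_adj = discount(lambda u: r(u) + lam(u), 0.0, 5.0)
print(d_adj, discount(r, 0.0, 5.0) * discount(lam, 0.0, 5.0))
```

The trapezoidal rule is exact for the affine curves chosen here, so the result matches the closed form $\exp(-0.15)$ up to floating-point error.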

# **3 Valuing Collateralized Interest-Rate Derivatives**

As we mentioned in the introduction, we will base our analysis on real market processes. All liquid market quotes on the money market (MM) correspond to instruments with daily collateralization at the overnight rate ($e_t$), both for the investor and the counterparty, namely $c_t \doteq e_t$.

<sup>2</sup>We refer to [3] and [6] for a precise derivation of the proposition.

Notice that the collateral accrual rate is symmetric, so that we no longer have a dependency of the accrual rates on the collateral price, as opposed to the general master equation case. Moreover, we further assume $r_t \doteq e_t$.

This makes sense because $e_t$, being an overnight rate, embeds a low counterparty risk and can be considered a good proxy for the risk-free rate $r_t$. We will describe some of these MM instruments, such as OIS and interest rate swaps (IRS), along with their underlying market rates, in the following sections. For the remainder of this section we adopt the perfect collateralization approximation of Eq. (1) to derive the valuation equations for OIS and IRS products, hence assuming no gap risk, while in the numerical experiments of Sect. 4 we will also consider uncollateralized deals. Furthermore, we assume that daily collateralization can be treated as continuous-dividend perfect collateralization. See [4] for a discussion of the impact of discrete-time collateralization on interest-rate derivatives.

# *3.1 Overnight Rates and OIS*

Among other instruments, the MM usually quotes the prices of overnight indexed swaps (OIS). Such contracts exchange a fixed-payment leg against a floating leg paying a discretely compounded rate based on the same overnight rate used for their collateralization. Since we are going to price OIS under the assumption of perfect collateralization, namely we are assuming that daily collateralization may be viewed as done on a continuous basis, we also approximate the daily compounding in the OIS floating leg with continuous compounding, which is reasonable when there is no gap risk. Hence the discounted payoff of a one-period OIS with tenor *x* and maturity *T* is given by

$$D(t, T; e)\left(1 + xK - \exp\left\{\int_{T-x}^{T} e_u\,du\right\}\right)$$

where *K* is the fixed rate paid by the OIS. Furthermore, we can introduce the (par) fixed rates $K = E_t(T, x; e)$ that make the one-period OIS contract fair, namely priced 0 at time *t*. They are implicitly defined via

$$\widetilde{V}^{\mathrm{OIS}}_t(K) := \mathbb{E}\left[\left(1 + xK - \exp\left\{\int_{T-x}^{T} e_u\,du\right\}\right)D(t, T; e) \mid \mathcal{F}_t\right]$$

with $\widetilde{V}^{\mathrm{OIS}}_t(E_t(T, x; e)) = 0$, leading to

$$E_t(T, x; e) := \frac{1}{x}\left(\frac{P_t(T - x; e)}{P_t(T; e)} - 1\right) \tag{3}$$

where we define collateralized zero-coupon bonds<sup>3</sup> as

$$P_t(T; e) := \mathbb{E}\left[D(t, T; e) \mid \mathcal{F}_t\right]. \tag{4}$$

One-period OIS rates *Et*(*T*, *x*; *e*), along with multi-period ones, are actively traded on the market. Notice that we can bootstrap collateralized zero-coupon bond prices from OIS quotes.
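This bootstrap amounts to inverting Eq. (3): $P_t(T; e) = P_t(T - x; e) / (1 + x\,E_t(T, x; e))$ applied pillar by pillar. A minimal Python sketch with hypothetical quotes (not any particular market snapshot):

```python
# Bootstrapping collateralized zero-coupon bonds P_t(T;e) from one-period
# OIS par rates via Eq. (3) rearranged: P(T) = P(T - x) / (1 + x * E).
# The quotes below are hypothetical.

def bootstrap_ois_bonds(ois_rates, tenor=1.0):
    bonds = [1.0]                        # P_t(t; e) = 1
    for e in ois_rates:
        bonds.append(bonds[-1] / (1.0 + tenor * e))
    return bonds

ois = [0.0050, 0.0065, 0.0080]           # 1Y-tenor OIS par rates, maturities 1Y-3Y
P = bootstrap_ois_bonds(ois)
print(P)
```

Re-applying Eq. (3) to the bootstrapped bonds recovers the input quotes, which is the defining consistency check of any bootstrap.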

# *3.2 LIBOR Rates, IRS and Basis Swaps*

LIBOR rates ($L_t(T)$) used to be linked to the term structure of default-free interbank interest rates in a fundamental way. In the classical term-structure theory, LIBOR rates would satisfy fundamental no-arbitrage conditions with respect to zero-coupon bonds, conditions which we no longer consider to hold, as pointed out earlier in (1). We now deal with a new definition of forward LIBOR rates that may take collateralization into account. LIBOR rates are still the indices used as reference rates for many collateralized interest-rate derivatives (IRS, basis swaps, …). IRS contracts swap a fixed-payment leg against a floating leg paying simply compounded LIBOR rates. IRS contracts are collateralized at the overnight rate $e_t$. Thus, a discounted one-period IRS payoff with maturity *T* and tenor *x* is given by

$$D(t, T; e)\,x\left(K - L_{T-x}(T)\right)$$

where *K* is the fixed rate paid by the IRS. Furthermore, we can introduce the (par) fixed rates $K = F_t(T, x; e)$ that render the one-period IRS contract fair, i.e. priced at zero. They are implicitly defined via

$$\widetilde{V}^{\mathrm{IRS}}_t(K) := \mathbb{E}\left[\left(xK - xL_{T-x}(T)\right)D(t, T; e) \mid \mathcal{F}_t\right],$$

with $\widetilde{V}^{\mathrm{IRS}}_t(F_t(T, x; e)) = 0$, leading to the following definition of the forward LIBOR rate

$$F_t(T, x; e) := \frac{\mathbb{E}\left[L_{T-x}(T)\,D(t, T; e) \mid \mathcal{F}_t\right]}{\mathbb{E}\left[D(t, T; e) \mid \mathcal{F}_t\right]} = \frac{\mathbb{E}\left[L_{T-x}(T)\,D(t, T; e) \mid \mathcal{F}_t\right]}{P_t(T; e)}$$

The above definition may be simplified by a suitable choice of the measure under which we take the expectation. In particular, we can consider the following Radon–Nikodym derivative, defining the collateralized *T*-forward measure Q*<sup>T</sup>*;*<sup>e</sup>* ,

<sup>3</sup>Notice that we are only defining a price process for hypothetical collateralized zero-coupon bond. We are not assuming that collateralized bonds are assets traded on the market.


$$Z_t(T; e) := \left.\frac{d\mathbb{Q}^{T;e}}{d\mathbb{Q}}\right|_{\mathcal{F}_t} := \frac{\mathbb{E}\left[D(0, T; e) \mid \mathcal{F}_t\right]}{P_0(T; e)} = \frac{D(0, t; e)\,P_t(T; e)}{P_0(T; e)}$$

which is a positive Q-martingale, normalized so that *Z*0(*T*; *e*) = 1.

Thus, for any payoff φ*<sup>T</sup>* , perfectly collateralized at overnight rate *et*, we can express prices as expectations under the collateralized *T*-forward measure and in particular, we can write LIBOR forward rates as

$$F_t(T, x; e) := \frac{\mathbb{E}\left[L_{T-x}(T)\,D(t, T; e) \mid \mathcal{F}_t\right]}{\mathbb{E}\left[D(t, T; e) \mid \mathcal{F}_t\right]} = \mathbb{E}^{T;e}\left[L_{T-x}(T) \mid \mathcal{F}_t\right]. \tag{5}$$

One-period forward rates *Ft*(*T*, *x*; *e*), along with multi-period ones (swap rates), are actively traded on the market. Once collateralized zero-coupon bonds are derived, we can bootstrap forward rate curves from such quotes. See, for instance, [1] or [27] for a discussion on bootstrapping algorithms.

Basis swaps are an interesting product that became more popular after the market switched to a multi-curve structure. In fact, in a basis swap there are two floating legs: one pays a LIBOR rate with a certain tenor, and the other pays the LIBOR rate with a shorter tenor plus a spread that makes the contract fair at inception. More precisely, the payoff of a basis swap whose legs pay LIBOR rates with tenors *x* < *y* and maturity *T* = *nx* = *my* is given by

$$\begin{aligned} &\sum_{i=1}^{n} D(t, T - (n-i)x; e)\,x\left(L_{T-(n-i+1)x}(T - (n-i)x) + K\right) \\ &\quad - \sum_{j=1}^{m} D(t, T - (m-j)y; e)\,y\,L_{T-(m-j+1)y}(T - (m-j)y). \end{aligned}$$

It is clear that, apart from being traded per se, this instrument is naturally present in banks' portfolios as a result of the netting of opposite swap positions with different tenors.
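Under the perfect-collateralization valuation of this section, each expected LIBOR payment is the forward $F_t(T, x; e)$ discounted on the OIS curve, so the fair basis spread has a simple annuity formula. The Python sketch below uses hypothetical flat curves for both legs; it is an illustration, not a market calibration:

```python
# Fair spread of a basis swap: the shorter-tenor leg pays L^x + K, the
# longer-tenor leg pays L^y. Under perfect collateralization each expected
# LIBOR payment is the forward F(T, x; e) discounted at OIS. All curves
# below are hypothetical flat examples.

def leg_value(discount, forwards, tenor):
    return sum(d * tenor * f for d, f in zip(discount, forwards))

def fair_basis_spread(df_x, fwd_x, x, df_y, fwd_y, y):
    annuity_x = sum(d * x for d in df_x)
    return (leg_value(df_y, fwd_y, y) - leg_value(df_x, fwd_x, x)) / annuity_x

# 2Y basis swap: 6M leg (x = 0.5, n = 4 periods) vs 1Y leg (y = 1.0, m = 2).
x, y = 0.5, 1.0
df_x  = [0.997, 0.994, 0.991, 0.988]     # OIS discount factors on the 6M grid
df_y  = [0.994, 0.988]                   # OIS discount factors on the 1Y grid
fwd_x = [0.0100] * 4                     # flat 6M forward curve
fwd_y = [0.0125] * 2                     # flat 1Y forward curve (tenor basis)

K = fair_basis_spread(df_x, fwd_x, x, df_y, fwd_y, y)
print(f"fair basis spread: {1e4 * K:.2f} bp")
```

By construction, adding the fair spread to every shorter-tenor forward makes the two legs' values coincide.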

# *3.3 Modeling Constraints*

Our aim is to set up a multiple-curve dynamical model starting from collateralized zero-coupon bonds *Pt*(*T*; *e*), and LIBOR forward rates *Ft*(*T*, *x*; *e*). As we have seen we can bootstrap the initial curves for such quantities from directly observed quotes in the market. Now, we wish to propose a dynamics that preserves the martingale properties satisfied by such quantities. Thus, without loss of generality, we can define collateralized zero-coupon bonds under the Q measure as

$$dP_t(T; e) = P_t(T; e)\left(e_t\,dt - \sigma_t^P(T; e)^*\,dW_t^e\right)$$


and LIBOR forward rates under the Q*<sup>T</sup>*;*<sup>e</sup>* measure as

$$dF\_t(T, x; e) = \sigma\_t^F(T, x; e)^\* \, dZ\_t^{T; e}$$

where $W^e$ and $Z^{T;e}$ are correlated standard (column) vector<sup>4</sup> Brownian motions with correlation matrix ρ, and the volatility vector processes $\sigma^P$ and $\sigma^F$ may depend on the bonds and forward LIBOR rates themselves.

The following definition of $f_t(T; e)$ is not strictly necessary, and we could keep working with the bonds $P_t(T; e)$, using their dynamics. However, as it is customary in interest rate theory to model rates rather than bonds, we may try to formulate quantities that are closer to the standard HJM framework. In this sense we define instantaneous forward rates $f_t(T; e)$, starting from (collateralized) zero-coupon bonds, as

$$f\_t(T; e) := -\partial\_T \log P\_t(T; e)$$

We can derive the instantaneous forward-rate dynamics by Itô's lemma, obtaining the following dynamics under the $\mathbb{Q}^{T;e}$ measure

$$df\_t(T;e) = \sigma\_t(T;e) \, dW\_t^{T;e}, \quad \sigma\_t(T;e) := \partial\_T \, \sigma\_t^P(T;e).$$

where the $W^{T;e}$'s are Brownian motions and partial differentiation is meant to be applied component-wise.

Hence, we can summarize our modeling assumptions in the following way. Since linear products (OIS, IRS, basis swaps, …) can be expressed in terms of simpler quantities, namely collateralized zero-coupon bonds $P_t(T; e)$ and LIBOR forward rates $F_t(T, x; e)$, we focus on modeling these. The initial term structures for collateralized products may be bootstrapped from market data; as for the dynamics, we write them by enforcing suitable no-arbitrage martingale properties, namely

$$df_t(T; e) = \sigma_t(T; e) \cdot dW_t^{T;e}, \quad dF_t(T, x; e) = \sigma_t^F(T, x; e) \cdot dZ_t^{T;e}. \tag{6}$$

As we explained in the introduction, this is where the multiple-curve picture finally shows up: we have a curve with LIBOR-based forward rates $F_t(T, x; e)$, which are collateral-adjusted expectations of the LIBOR market rates $L_{T-x}(T)$ that we take as primitive rates from the market, and we have instantaneous forward rates $f_t(T; e)$ that are OIS-based rates. OIS rates $f_t(T; e)$ are driven by collateral fees, whereas LIBOR forward rates $F_t(T, x; e)$ are driven both by collateral rates and by the primitive LIBOR market rates.

<sup>4</sup>In the following we will consider $N$-dimensional vectors as $N \times 1$ matrices. Moreover, given a matrix $A$, we will indicate by $A^*$ its transpose, and if $B$ is another conformable matrix we denote by $AB$ the usual matrix product.
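The martingale property in Eq. (6) can be checked numerically. The sketch below uses a one-factor shifted-lognormal specification $dF = (k + F)\,\Sigma\,dW$ with hypothetical constant parameters (anticipating the shifted dynamics of Sect. 4); since the SDE then has an exact solution, the Monte Carlo check carries no discretization bias:

```python
import numpy as np

# Monte Carlo check that driftless dynamics leave the LIBOR forward rate a
# martingale under the collateralized T-forward measure. One-factor
# shifted-lognormal example with hypothetical constant parameters:
#   k + F_T = (k + F_0) * exp(Sigma * W_T - 0.5 * Sigma^2 * T).
rng = np.random.default_rng(42)

F0, k, Sigma, T = 0.02, 2.0, 0.20, 1.0   # e.g. k = 1/x for tenor x = 0.5
n_paths = 200_000

W_T = rng.standard_normal(n_paths) * np.sqrt(T)
F_T = (k + F0) * np.exp(Sigma * W_T - 0.5 * Sigma**2 * T) - k

mc_mean = F_T.mean()
stderr = F_T.std(ddof=1) / np.sqrt(n_paths)
print(f"E[F_T] = {mc_mean:.6f}  (F_0 = {F0}, MC error ~ {stderr:.2e})")
```

The sample mean of $F_T$ matches $F_0$ within Monte Carlo error, and the shifted rates stay above $-k$ by construction.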

# **4 Interest-Rate Modeling**

We can now specialize our modeling assumptions to define a model for interest-rate derivatives which is, on the one hand, flexible enough to calibrate the quotes of the MM and, on the other hand, robust. Our aim is an HJM framework in which a single family of Markov processes describes all the term structures and interest rate curves we are interested in.

In the literature many authors proposed generalizations of the HJM framework to include multiple yield curves. In particular, we cite the works of [12–14, 16, 20–23]. A survey of the literature can be found in [17].

In such works the problem is dealt with in a pragmatic way by considering each forward rate as a single asset, without investigating the microscopic dynamics implied by liquidity and credit risks. However, the hypothesis of introducing different underlying assets may lead to over-parametrization issues that affect the calibration procedure. Indeed, the presence of swap and basis-swap quotes on many different yield curves is not sufficient, as the market quotes swaption premia only on a few yield curves. For instance, even if the Euro market quotes one-, three-, six- and twelve-month swap contracts, liquidly traded swaptions are only those indexed to the three-month (maturity one year) and the six-month (maturities from two to thirty years) Euribor rates. Swaptions referring to other Euribor tenors or to overnight rates are not actively quoted.

In order to solve such a problem, [23] introduces a parsimonious model that describes a multi-curve setting starting from a limited number of (Markov) processes, extending the logic of the HJM framework so as to describe all the curves we are interested in with a unique family of Markov processes.

# *4.1 Multiple-Curve Collateralized HJM Framework*

We follow [22, 23] by reformulating their theory under the Q*<sup>T</sup>*;*<sup>e</sup>* measure. We model only observed rates as in market model approaches and we consider a common family of processes for all the yield curves of a given currency, so that we are able to build parsimonious yet flexible models. Hence let us summarize the basic requirements the model must fulfill:


While the first two points are related to the set of financial quantities we are about to model, the last two are conditions we impose on their dynamics, and will be granted by the right choice of model volatilities. Hence, we choose, under the $\mathbb{Q}^{T;e}$ measure, the following dynamics:

$$\begin{aligned} df_t(T; e) &= \sigma_t(T)^*\,dW_t^{T;e} \\ dF_t(T, x; e) &= \left(k(T, x) + F_t(T, x; e)\right)\Sigma_t(T, x)^*\,dW_t^{T;e} \end{aligned} \tag{7}$$

where we introduce the families of (stochastic $N$-dimensional) volatility processes $\sigma_t(T)$ and $\Sigma_t(T, x)$, the vector of $N$ independent $\mathbb{Q}^{T;e}$-Brownian motions $W_t^{T;e}$, and the set of deterministic shifts $k(T, x)$ such that $\lim_{x \to 0} x\,k(T, x) = 1$. This limit condition ensures that the model approaches a standard default- and liquidity-free HJM model when the tenor goes to zero. We bootstrap $f_0(T; e)$ and $F_0(T, x; e)$ from market quotes.

In order to get a model with a reduced number of common driving factors in the spirit of HJM approaches, it is sufficient to conveniently tie together the volatility processes $\sigma_t(T)$ and $\Sigma_t(T, x)$ through a third volatility process $\sigma_t(u; T, x)$:

$$\sigma\_t(T) := \sigma\_t(T; T, 0) \,, \quad \Sigma\_t(T, x) := \int\_{T-x}^T \sigma\_t(u; T, x) \, du. \tag{8}$$

Under this parametrization the OIS curve dynamics is the very same as that of the risk-free curve in an ordinary HJM framework. Indeed, for linearly compounding forward rates we have

$$dE_t(T, x; e) = \left(1/x + E_t(T, x; e)\right)\int_{T-x}^{T} \sigma_t(u)^*\,du\;dW_t^{T;e}.$$

In the generalized version of the HJM framework proposed by [23] we have an explicit expression for both the collateralized zero-coupon bonds $P_t(T; e)$ and the LIBOR forward rates $F_t(T, x; e)$. The first result is a direct consequence of modeling the OIS curve as the risk-free curve in a standard HJM framework, while the second result can be achieved only if a particular form of the volatilities is selected. We obtain this by generalizing the approach of [28] and introducing the following separability constraint

$$\sigma_t(u; T, x) := h(t)\,q(u, T, x)\,g(t, u), \quad g(t, u) := \exp\left\{-\int_t^u a(s)\,ds\right\}, \quad q(u; u, 0) := \mathrm{Id}, \tag{9}$$

where $h(t)$ is an $N \times N$ matrix process, $q(u, T, x)$ is a deterministic $N \times N$ diagonal matrix function, and $a(s)$ is a deterministic $N$-dimensional vector function. The condition that $q(u; T, x)$ be the identity matrix when $T = u$ ensures that a standard HJM framework holds for collateralized zero-coupon bonds.

We can work out an explicit expression for the LIBOR forward rates, by plugging the expression of the volatilities into Eq. (7). We obtain

$$\begin{aligned} &\log\left(\frac{k(T,\mathbf{x}) + F\_t(T,\mathbf{x};e)}{k(T,\mathbf{x}) + F\_0(T,\mathbf{x};e)}\right) \\ & \qquad = G(t,T-\mathbf{x},T;T,\mathbf{x})^\* \left(X\_t + Y\_t\left(G\_0(t,t,T) - \frac{1}{2}G(t,T-\mathbf{x},T;T,\mathbf{x})\right)\right), \end{aligned} \tag{10}$$

where the stochastic vector process *Xt* and the auxiliary matrix process *Yt* are defined under the Q measure as in the ordinary HJM framework

$$\begin{aligned} X_t^i &= \sum_{k=1}^{N} \int_0^t g_i(s, t)\left(h_{ik,s}\,dW_{k,s} + (h_s^* h_s)_{ik}\left(\int_s^t g_k(s, y)\,dy\right)ds\right), \quad i = 1 \ldots N \\ Y_t^{ik} &= \int_0^t g_i(s, t)\,(h_s^* h_s)_{ik}\,g_k(s, t)\,ds, \quad i, k = 1 \ldots N \end{aligned}$$

and

$$G_0(t, T_0, T_1) = \int_{T_0}^{T_1} g(t, s)\,ds, \quad G(t, T_0, T_1; T, x) = \int_{T_0}^{T_1} q(s, T, x)\,g(t, s)\,ds.$$

It is worth noting that the integral representation of forward LIBOR volatilities given by Eq. (8), together with the common separability constraint given in Eq. (9), are sufficient conditions to ensure the existence of a reconstruction formula for all OIS and LIBOR forward rates based on the very same family of Markov processes (see [3]).

We are interested in certain specifications of this model, in particular a variant of the Hull–White model (HW), a variant of the Cheyette model (Ch), and the Moreni–Pallavicini model (MP). The HW model [18] is the simplest one, and is obtained by choosing

$$h(t) \doteq R\,, \quad q(u, T, x) \doteq \mathrm{Id}\,, \quad a(s) \doteq a\,, \quad k(T, x) \doteq \frac{1}{x} \tag{11}$$

where *a* is a constant vector and *R* is the Cholesky decomposition of the correlation matrix that we want our $X_t$ vector to have. In this case we obtain $\sigma_t(u; T, x) = R \cdot e^{-a(u-t)}$, where the exponential is intended component-wise. Then we note that $X_t$ is a mean-reverting Gaussian process, while the $Y_t$ process is deterministic.
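The Gaussian mean-reverting nature of $X_t$ is easy to verify numerically in the scalar case. The sketch below (hypothetical parameters) simulates only the martingale part $I_t = \int_0^t e^{-a(t-s)}\,dW_s$, whose variance is $(1 - e^{-2at})/(2a)$ in closed form, using the exact Gaussian recursion rather than an Euler scheme:

```python
import numpy as np

# Exact simulation of the scalar OU-type integral I_t = int_0^t e^{-a(t-s)} dW_s.
# Over a step of size dt: I_{t+dt} = e^{-a dt} I_t + eps, with eps Gaussian of
# variance (1 - e^{-2 a dt}) / (2 a). Parameters are hypothetical.
rng = np.random.default_rng(7)

a, T, n_steps, n_paths = 0.1, 5.0, 50, 100_000
dt = T / n_steps
decay = np.exp(-a * dt)
step_std = np.sqrt((1 - np.exp(-2 * a * dt)) / (2 * a))

I = np.zeros(n_paths)
for _ in range(n_steps):
    I = decay * I + step_std * rng.standard_normal(n_paths)

var_exact = (1 - np.exp(-2 * a * T)) / (2 * a)
print(f"sample var {I.var(ddof=1):.4f} vs exact {var_exact:.4f}")
```

The exact recursion avoids any discretization bias, so the sample variance matches the closed form up to Monte Carlo error.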

In order to model implied volatility smiles, we can add a stochastic volatility process to our model, as shown in [22]. In particular, we can obtain a variant of the Ch model [10] by considering a common square-root process for all the entries of h, as in [29]. More precisely, we replace h(t) in (11) with h(t) ≐ √(v_t) R, with a and R as before and v_t a process with the following dynamics:


$$dv\_t = \eta \left(1 - v\_t\right) dt + \nu\_0 \left(1 + (\nu\_1 - 1)e^{-\nu\_2 t}\right) \sqrt{v\_t}\, dZ\_t \quad , \quad v\_0 = \tilde{v} \tag{12}$$

where Z_t is a Brownian motion correlated with W_t, so that the volatility process becomes σ_t(u; T, x) = √(v_t) R · e^{−a(u−t)}.
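A full-truncation Euler scheme is a standard way to discretize square-root dynamics such as (12). The sketch below uses illustrative parameter values (η, ν_0, ν_1, ν_2 are not calibrated numbers from the text).

```python
import numpy as np

# Full-truncation Euler discretization of the square-root volatility (12):
# dv = eta (1 - v) dt + nu0 (1 + (nu1 - 1) exp(-nu2 t)) sqrt(v) dZ.
# All parameter values are illustrative placeholders.
def simulate_vol(eta, nu0, nu1, nu2, v0, T, n_steps, n_paths, seed=1):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    v = np.full(n_paths, v0, dtype=float)
    for k in range(n_steps):
        t = k * dt
        vol_of_vol = nu0 * (1.0 + (nu1 - 1.0) * np.exp(-nu2 * t))
        v_pos = np.maximum(v, 0.0)  # full truncation keeps the sqrt real
        v = (v + eta * (1.0 - v_pos) * dt
             + vol_of_vol * np.sqrt(v_pos * dt) * rng.standard_normal(n_paths))
    return np.maximum(v, 0.0)

v_T = simulate_vol(eta=1.2, nu0=0.3, nu1=1.5, nu2=0.8, v0=1.0,
                   T=5.0, n_steps=500, n_paths=20000)
print(v_T.mean())  # the drift mean-reverts v towards the long-run level 1
```

Full truncation only floors the argument of the square root, so negative excursions of the discretized path do not propagate a complex-valued diffusion coefficient.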

As a last specification of the framework we consider the MP model, which uses a different shift k(T, x) and introduces a dependence on the tenor in the volatility process:

$$h(t) \doteq \sqrt{v\_t}\, R \,, \quad q(u, T, x)^{i,i} \doteq e^{x\eta^i} \,, \quad a(s) \doteq a \,, \quad k(T, x) \doteq \frac{e^{-\chi x}}{\chi} \tag{13}$$

with a and R as before and v_t defined by (12). Here the volatility reads σ_t(u; T, x) = √(v_t) R · e^{ηx−a(u−t)}.

To better appreciate the difference between the Ch model and the MP model one could compute the quantity

$$\beta\_t(x\_1, x\_2; e) := \frac{1}{x\_2} \log \left( \frac{\frac{1}{x\_2} + E\_t(t + x\_2, x\_2; e)}{\frac{1}{x\_2} + F\_t(t + x\_2, x\_2; e)} \right) - \frac{1}{x\_1} \log \left( \frac{\frac{1}{x\_1} + E\_t(t + x\_1, x\_1; e)}{\frac{1}{x\_1} + F\_t(t + x\_1, x\_1; e)} \right)$$

which represents the time-normalized difference between two forward rates with different tenors and thus can be used as a proxy for the value of a basis swap. In the HW and Ch models β_t(x_1, x_2; e) is deterministic, while in the MP model it is a stochastic quantity. This suggests that the MP model should better capture the dynamics of the basis between two rates with different tenors. We refer the reader to [3] for a more detailed analysis of the issue, and to [23] for calibration and valuation examples for the swaption and cap/floor markets.
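Taking the displayed expression at face value, with E_t read as the OIS-implied forward and F_t as the Libor forward of the same tenor (our reading of the notation), the proxy is a one-liner. The rate inputs below are hypothetical.

```python
import math

# Basis-swap proxy beta_t(x1, x2; e): time-normalized difference of shifted
# log-rates for two tenors, with shift k(T, x) = 1/x.  E is read as the
# OIS-implied forward and F as the Libor forward (an assumption on notation);
# all rate values are hypothetical.
def basis_proxy(x1, F1, E1, x2, F2, E2):
    term = lambda x, F, E: (1.0 / x) * math.log((1.0 / x + E) / (1.0 / x + F))
    return term(x2, F2, E2) - term(x1, F1, E1)

# 3m vs 6m tenors with a small, tenor-increasing Libor-OIS spread:
beta = basis_proxy(x1=0.25, F1=0.0210, E1=0.0200,
                   x2=0.50, F2=0.0225, E2=0.0200)
print(beta)
```

In the HW and Ch models the inputs above would evolve so that beta stays deterministic, while in the MP model the tenor-dependent volatility makes it move stochastically.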

# *4.2 Numerical Results*

We apply our framework to two simple but relevant products: an IRS and a basis swap. We analyze the impact of the choice of the interest rate model on the portfolio valuation; in particular, we measure the dependence of the price on the correlations between interest rates and credit spreads, the so-called wrong-way risk. We model the market risks by simulating the following processes in a multiple-curve HJM model under the pricing measure Q. The overnight rate e_t and the LIBOR forward rates F_t(T; e) are simulated according to the dynamics given in Sect. 4.1. Maintaining the notation of that section, we choose N = 2, and for our numerical experiments we use a HW model, a Ch model, and an MP model, all calibrated to swaption at-the-money volatilities quoted on the European market.

As we have already noted, the Ch model introduces a stochastic volatility and hence has an increased number of parameters with respect to the HW model. The MP model aims at better modeling the basis between rates with different tenors, while keeping the model parsimonious in terms of extra parameters with respect to the Ch model. In particular the HW model is able to reproduce the ATM quotes but is not able to correctly reproduce the volatility smile. On the other hand, the introduction of a stochastic volatility process helps in recovering the market data smile and thus the Ch and the MP models have similar results in properly fitting the smile. The detailed results of the calibration are available in [3].

As concerns the credit part, the default intensities of the investor and of the counterparty are given by two CIR++ processes λ^i_t = y^i_t + ψ^i(t) under the Q^{T;e} measure, i.e. they follow

$$dy\_t^i = \kappa^i(\mu^i - y\_t^i)\,dt + \xi^i \sqrt{y\_t^i} \,dZ\_t^i \quad , \quad i \in \{I, C\}$$

where the two Z^i are Brownian motions correlated with the W^{T;e}, and they are calibrated to the market data shown in [4]. In particular, two different market settings are used in the numerical examples: a medium-risk and a high-risk setting. The correlations among the risky factors are induced by correlating the Brownian motions as in [8].
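The credit legs can be sketched as follows: a full-truncation Euler scheme for the two square-root factors y^i with correlated drivers, and constant shifts standing in for the calibrated deterministic functions ψ^i. All parameter values (κ, μ, ξ, the shifts, the correlation) are illustrative, not the calibrated ones of the text.

```python
import numpy as np

# Sketch of the credit part: two CIR++ intensities lambda^i = y^i + psi^i,
# i in {I, C}, with correlated Brownian drivers.  Constant shifts psi and
# all parameters are illustrative placeholders, not calibrated values.
def simulate_cirpp(kappa, mu, xi, y0, psi, corr, T, n_steps, n_paths, seed=2):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    L = np.linalg.cholesky(corr)        # correlates the two Brownian motions
    y = np.full((2, n_paths), y0, dtype=float)
    integrated = np.zeros(n_paths)      # \int_0^T lambda^C_s ds for name C
    for k in range(n_steps):
        dW = (L @ rng.standard_normal((2, n_paths))) * np.sqrt(dt)
        y_pos = np.maximum(y, 0.0)      # full truncation
        y = y + kappa * (mu - y_pos) * dt + xi * np.sqrt(y_pos) * dW
        integrated += (y_pos[1] + psi) * dt
    return np.exp(-integrated).mean()   # model-implied survival prob of C

surv = simulate_cirpp(kappa=0.5, mu=0.02, xi=0.1, y0=0.02, psi=0.005,
                      corr=np.array([[1.0, 0.4], [0.4, 1.0]]),
                      T=5.0, n_steps=500, n_paths=10000)
print(surv)
```

The Cholesky factor applied to the driver pair is the same device used in the text to induce correlations among all risky factors.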

We now analyze the impact of wrong-way risk on the bilateral adjustment, namely CVA plus DVA, of IRS and basis swaps when collateralization is switched off, i.e. we evaluate Eq. (1) with α_t ≐ 0. For an extended analysis see [3]. Wrong-way risk is expressed with respect to the correlation between the default intensities and a proxy of market risk, namely the short rate e_t.
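As a stylized sketch of an uncollateralized bilateral adjustment (a textbook-style discretization, not the exact Eq. (1) of the text): assuming independence between exposure and default times, zero discounting, and constant default intensities, CVA and DVA reduce to sums of expected exposures weighted by interval default probabilities. The exposure profiles and intensities below are hypothetical placeholders.

```python
import numpy as np

# Stylized CVA/DVA under independence of exposure and defaults, zero rates,
# constant intensities and LGD = 0.6 for both names.  Hypothetical inputs.
def bilateral_adjustment(times, epe, ene, lam_C, lam_I, lgd_C=0.6, lgd_I=0.6):
    t = np.asarray(times)
    # default probabilities per interval from constant intensities
    dP_C = np.exp(-lam_C * t[:-1]) - np.exp(-lam_C * t[1:])
    dP_I = np.exp(-lam_I * t[:-1]) - np.exp(-lam_I * t[1:])
    cva = lgd_C * np.sum(epe[1:] * dP_C)   # cost of counterparty default
    dva = lgd_I * np.sum(ene[1:] * dP_I)   # benefit of own default
    return cva, dva

times = np.linspace(0.0, 10.0, 41)
epe = 0.01 * np.sqrt(times)       # toy expected positive exposure profile
ene = -0.008 * np.sqrt(times)     # toy expected negative exposure profile
cva, dva = bilateral_adjustment(times, epe, ene, lam_C=0.02, lam_I=0.01)
print(cva + dva)   # bilateral adjustment, in units of notional
```

Wrong-way risk is precisely what the independence assumption above throws away: in the text the exposures and the intensities are simulated jointly, so the adjustment moves with their correlation.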

In Fig. 1 we show the variation of the bilateral adjustment for a ten-year IRS receiving a fixed rate yearly and paying 6m Libor twice a year, and for a ten-year basis swap receiving 3m Libor plus spread and paying 6m Libor. It is clear that for a product like the IRS, which is not exposed to the basis dynamics, the main difference among the models is the presence of a stochastic volatility. Indeed, the Ch model and the MP model are almost indistinguishable, while the results of the HW model differ from those of the stochastic volatility models. Moreover, we can

**Fig. 1** Wrong-way risk for different models. On the horizontal axis, the correlation between credit and market risks; on the vertical axis, the bilateral adjustment, namely CVA + DVA, in basis points. *Left panel* a 10y IRS receiving a fixed rate and paying 6m Libor. *Right panel* a 10y basis swap receiving 3m Libor plus spread and paying 6m Libor. The Monte Carlo error is displayed where significant

observe that all the models show the same trend, i.e. the bilateral adjustment grows as the correlation increases. This can be explained by the fact that a higher correlation means that the deal will be more profitable exactly when it is more risky (since we are receiving the fixed rate and paying the floating one), hence the bilateral adjustment will be larger.

In the case of a basis swap, instead, we see that, as noted above, the HW and Ch models do not have a basis dynamics, and hence the corresponding curves are almost flat. On the other hand, the MP model is able to capture the dynamics of the basis: the more the overnight rate is correlated with the credit risk, the smaller the bilateral adjustment becomes.

We conclude by pointing out that our analysis will be extended to partially collateralized deals in future work. In such a context funding costs enter the picture in a more comprehensive way. Some initial suggestions in this respect were given in [24].

**Acknowledgements** The KPMG Center of Excellence in Risk Management is acknowledged for organizing the conference "Challenges in Derivatives Markets - Fixed Income Modeling, Valuation Adjustments, Risk Management, and Regulation".

**Open Access** This chapter is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, a link is provided to the Creative Commons license and any changes made are indicated.

The images or other third party material in this chapter are included in the work's Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work's Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.

# **References**


# **A Generalized Intensity-Based Framework for Single-Name Credit Risk**

**Frank Gehmlich and Thorsten Schmidt**

**Abstract** The intensity of a default time is obtained by assuming that the default indicator process has an absolutely continuous compensator. Here we drop the assumption of absolute continuity with respect to the Lebesgue measure and only assume that the compensator is absolutely continuous with respect to a general σ-finite measure. This allows, for example, to incorporate the Merton model into the generalized intensity-based framework. We propose a class of generalized Merton models and study absence of arbitrage by a suitable modification of the forward-rate approach of Heath–Jarrow–Morton (1992). Finally, we study affine term structure models which fit in this class. They exhibit stochastic discontinuities, in contrast to the affine models previously studied in the literature.

**Keywords** Credit risk · HJM · Forward-rate · Structural approach · Reduced-form approach · Stochastic discontinuities

# **1 Introduction**

The two most common approaches to credit risk modeling are the *structural* approach, pioneered in the seminal work of Merton [23], and the *reduced-form* approach which can be traced back to early works of Jarrow, Lando, and Turnbull [18, 22] and to [1].

Default of a company happens when the company is not able to meet its obligations. In many cases the debt structure of a company is known to the public, so that default happens with positive probability at times which are known a priori. This, however, is excluded in the intensity-based framework, and it is the purpose of this article to put forward a generalization which allows incorporating such effects. Examples in the literature are, e.g., structural models like [13, 14, 23]. The recently

F. Gehmlich (B) · T. Schmidt

Department of Mathematics, University of Freiburg, Eckerstr 1, 79106 Freiburg, Germany e-mail: gehmlichfrank@gmail.com

T. Schmidt

e-mail: thorsten.schmidt@stochastik.uni-freiburg.de

<sup>©</sup> The Author(s) 2016

K. Glau et al. (eds.), *Innovations in Derivatives Markets*, Springer Proceedings in Mathematics & Statistics 165, DOI 10.1007/978-3-319-33446-2\_13

missed coupon payment by Argentina is an example of such a credit event, as is the default of Greece on July 1, 2015.<sup>1</sup>

It is a remarkable observation of [2] that it is possible to extend the reduced-form approach beyond the class of intensity-based models. The authors study a class of first-passage time models under a filtration generated by a Brownian motion and show its use for pricing and modeling credit risky bonds. Our goal is to start with even weaker assumptions on the default time and to allow for jumps in the compensator of the default time at deterministic times. From this general viewpoint it turns out, surprisingly, that previously used HJM approaches lead to arbitrage: the whole term structure is absolutely continuous and cannot compensate for points in time bearing a positive default probability. We propose a suitable extension with an additional term allowing for discontinuities in the term structure at certain random times and derive precise drift conditions for an appropriate no-arbitrage condition. The related article [12] only allows for the special case of finitely many risky times, an assumption which is dropped in this article.

The structure of this article is as follows: in Sect. 2, we introduce the general setting and study drift conditions in an extended HJM-framework which guarantee absence of arbitrage in the bond market. In Sect. 3 we study a class of affine models which are stochastically discontinuous. Section 4 concludes.

# **2 A General Account on Credit Risky Bond Markets**

Consider a filtered probability space (Ω, 𝒜, G, P) with a filtration G = (𝒢_t)_{t≥0} (the *general* filtration) satisfying the usual conditions, i.e. it is right-continuous and 𝒢_0 contains the P-null sets 𝒩_0 of 𝒜. Throughout, the probability measure P denotes the objective measure. As we use tools from stochastic analysis, all appearing filtrations shall satisfy the usual conditions. We follow the notation from [17] and refer to this work for details on stochastic processes which are not laid out here.

The filtration G contains all available information in the market. The default of a company is public information and we therefore assume that the default time τ is a G-stopping time. We denote the *default indicator process H* by

$$H\_t = \mathbf{1}\_{\{t \ge \tau\}}, \quad t \ge 0,$$

such that H_t = 1_{[τ,∞)}(t) is a right-continuous, increasing process. We will also make use of the *survival process* 1 − H = 1_{[0,τ)}. The following remark recalls the essentials of the well-known intensity-based approach.

<sup>1</sup>Argentina's missed coupon payment on $29 billion of debt was voted a credit event by the International Swaps and Derivatives Association, see the announcements in [16, 24]. Regarding Greece's failure to make a scheduled repayment of EUR 1.5 billion to the International Monetary Fund, see e.g. [9].

*Remark 1* (*The intensity-based approach*) The intensity-based approach consists of two steps: first, denote by H = (ℋ_t)_{t≥0} the filtration generated by the default indicator, ℋ_t = σ(H_s : 0 ≤ s ≤ t) ∨ 𝒩_0, and assume that there exists a sub-filtration F of G, i.e. ℱ_t ⊂ 𝒢_t holds for all t ≥ 0, such that

$$
\mathcal{G}\_t = \mathcal{F}\_t \vee \mathcal{H}\_t, \quad t \ge 0. \tag{1}
$$

Viewed from this perspective, G is obtained from the default information H by a *progressive enlargement*<sup>2</sup> with the filtration F. This assumption opens the area for the largely developed field of enlargements of filtration with a lot of powerful and quite general results.

Second, the following key assumption specifies the default intensity: assume that there is an F-progressive process λ, such that

$$P(\tau > t | \mathcal{F}\_t) = \exp\left(-\int\_0^t \lambda\_s ds\right), \quad t \ge 0. \tag{2}$$

It is immediate that the inclusion ℱ_t ⊂ 𝒢_t is strict under existence of an intensity, i.e. τ is not an F-stopping time. Arbitrage-free pricing can be achieved via the following result: let Y be a non-negative random variable. Then, for all t ≥ 0,

$$E[1\_{\{\tau>t\}}Y|\mathcal{G}\_t] = 1\_{\{\tau>t\}}e^{\int\_0^t \lambda\_s ds}E[1\_{\{\tau>t\}}Y|\mathcal{F}\_t].$$

Of course, this result holds also when a pricing measure *Q* is used instead of *P*. For further literature and details we refer for example to [11], Chap. 12, and to [3].
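The survival formula (2) can be checked numerically with the classical Cox construction of a default time from a given hazard: τ is the first time the cumulated hazard exceeds an independent standard exponential threshold. The sketch below uses a deterministic toy intensity; grid and parameters are illustrative.

```python
import numpy as np

# Cox-construction sanity check of (2): with
#   tau = inf{t : int_0^t lambda_s ds >= E},  E ~ Exp(1) independent,
# the survival probability is exp(-int_0^t lambda_s ds).
# Deterministic toy intensity lambda_s = 0.01 + 0.005 s.
rng = np.random.default_rng(3)
n_paths = 200_000
t_grid = np.linspace(0.0, 10.0, 1001)
lam = 0.01 + 0.005 * t_grid
# cumulated hazard via the trapezoid rule (exact for a linear intensity)
cum = np.concatenate([[0.0],
                      np.cumsum(0.5 * (lam[1:] + lam[:-1]) * np.diff(t_grid))])
E = rng.exponential(size=n_paths)
# default time: first grid point where the cumulated hazard exceeds E
tau = t_grid[np.minimum(np.searchsorted(cum, E), len(t_grid) - 1)]
mc = np.mean(tau > 5.0)
exact = np.exp(-np.interp(5.0, t_grid, cum))
print(mc, exact)
```

Since 5.0 is itself a grid point, the event {tau > 5} coincides exactly with {E > cum(5)}, so the only discrepancy left is Monte Carlo noise.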

# *2.1 The Generalized Intensity-Based Framework*

The default indicator process H is a bounded, càdlàg, and increasing process, hence a submartingale of class (D), that is, the family (H_T) over all stopping times T is uniformly integrable. By the Doob–Meyer decomposition,<sup>3</sup> the process

$$M\_t = H\_t - \Lambda\_t, \quad t \ge 0 \tag{3}$$

is a true martingale, where Λ denotes the dual predictable projection, also called the compensator, of H. As 1 is an absorbing state, Λ_t = Λ_{t∧τ}. To keep the arising technical difficulties at a minimum, we assume that there is an increasing process A such that

<sup>2</sup>Note that here G is right-continuous and *P*-complete by assumption which is a priori not guaranteed by (1). One can, however, use the right-continuous extension and we refer to [15] for a precise treatment and for a guide to the related literature.

<sup>3</sup>See [20], Theorem 1.4.10.

$$\Lambda\_t = \int\_0^{t\wedge\tau} \lambda\_s\, dA(s), \quad t \ge 0,\tag{4}$$

with a non-negative and predictable process λ. The process λ is called *generalized intensity* and we refer to Chap. VIII.4 of [5] for a more detailed treatment of generalized intensities (or, equivalently, dual predictable projections) in the context of point processes.

Note that with ΔM ≤ 1 we have ΔΛ_s = λ_s ΔA(s) ≤ 1. Whenever λ_s ΔA(s) > 0, there is a positive probability that the company defaults at time s. We call such times *risky times*, i.e. predictable times having a positive probability of a default occurring right at that time. Note that under our assumption (4), all risky times are deterministic. The relationship between ΔΛ(s) and the default probability at time s will be clarified in Example 3.
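A minimal numerical illustration of a risky time, assuming the standard relation ΔΛ(s) = P(τ = s | ℱ_{s−}) on {τ ≥ s} (the relation made precise in Example 3): we sample τ with a constant hazard plus a single atom and recover the jump of the compensator as a conditional default probability. All numbers are illustrative.

```python
import numpy as np

# Toy generalized intensity with one risky time u = 2:
# A(t) = t + 1_{t >= 2}, lambda_s = 0.05 for s != 2 and lambda_2 = 0.3,
# so Delta Lambda(2) = 0.3 is the conditional default probability at u.
rng = np.random.default_rng(4)
n = 400_000
u, atom, h = 2.0, 0.3, 0.05
tau = rng.exponential(1.0 / h, size=n)          # continuous part, hazard 0.05
at_atom = (tau > u) & (rng.random(n) < atom)    # survivors may default at u
tau = np.where(at_atom, u, tau)
cond = np.mean(tau == u) / np.mean(tau >= u)
print(cond)   # close to Delta Lambda(2) = 0.3
```

The atom shows up as a genuine point mass of the law of τ, which is exactly what an absolutely continuous compensator (the classical intensity setting) cannot produce.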

# *2.2 An Extension of the HJM Approach*

A credit risky bond with maturity T is a contingent claim promising to pay one unit of currency at T. The price of the bond with maturity T at time t ≤ T is denoted by P(t, T). If no default has occurred prior to or at T, we have P(T, T) = 1. We consider zero recovery, i.e. the bond loses its total value at default, such that P(t, T) = 0 on {t ≥ τ}. The family of stochastic processes {(P(t, T))_{0≤t≤T}, T ≥ 0} describes the evolution of the *term structure* T ↦ P(·, T) over time.

Besides the bonds there is a *numéraire* X⁰, which is a strictly positive, adapted process. We make the weak assumption that log X⁰ is absolutely continuous, i.e. X⁰_t = exp(∫_0^t r_s ds) with a progressively measurable process r, called the short rate. For practical applications one would use the overnight index swap (OIS) rate for constructing such a numéraire.

The aim of the following is to extend the HJM approach in an appropriate way to the generalized intensity-based framework in order to obtain arbitrage-free bond prices. First approaches in this direction were [7, 25] and a rich source of literature is again [3]. Absence of arbitrage in such an infinite dimensional market can be described in terms of no asymptotic free lunch (NAFL) or the more economically meaningful no asymptotic free lunch with vanishing risk, see [6, 21].

Consider a pricing measure *Q*<sup>∗</sup> ∼ *P*. Our intention is to find conditions which render *Q*<sup>∗</sup> an equivalent local martingale measure. In the following, only occasionally the measure *P* will be used, such that from now on, all appearing terms (like martingales, almost sure properties, etc.) are to be considered with respect to *Q*∗.

To ensure that the subsequent analysis is meaningful, we make the following technical assumption.

**Assumption 2.1** The generalized default intensity λ is non-negative, predictable, and *A*-integrable on [0, *T* ∗]:


$$\int\_0^{T^\*} \lambda\_s dA(s) < \infty, \quad \mathcal{Q}^\*\text{-a.s.}$$

Moreover, *A* has vanishing singular part, i.e.

$$A(t) = t + \sum\_{0 < s \le t} \Delta A(s). \tag{5}$$

The representation (5) of A is without loss of generality: indeed, if the continuous part A^c is absolutely continuous, i.e. A^c(t) = ∫_0^t a(s) ds, then replacing λ_s by λ_s a(s) gives the compensator of H with respect to an Ã whose continuous part is t.

Next, we aim at building an arbitrage-free framework for bond prices. In the generalized intensity-based framework, the (HJM) approach does allow for arbitrage opportunities at risky times. We therefore consider the following generalization: consider a σ-finite (deterministic) measure ν. We could be general on ν, allowing for an absolutely continuous, a singular continuous, and a pure-jump part. However, for simplicity, we leave the singular continuous part aside and assume that

$$\nu = \nu^{ac} + \nu^d$$

where ν^{ac}(ds) = ds and ν^d distributes mass only to points, i.e. ν^d(B) = ∑_{i≥1} w_i δ_{u_i}(B) for 0 < u₁ < u₂ < ⋯ and positive weights w_i > 0, i ≥ 1; here δ_u denotes the Dirac measure at u. Moreover, we assume that defaultable bond prices are given by

$$\begin{split} P(t,T) &= \mathbf{1}\_{\{\tau > t\}} \exp\left(-\int\_{t}^{T} f(t,u)\nu(du)\right) \\ &= \mathbf{1}\_{\{\tau > t\}} \exp\left(-\int\_{t}^{T} f(t,u)\, du - \sum\_{i\ge 1} \mathbf{1}\_{\{u\_{i} \in [t,T]\}} w\_{i} f(t,u\_{i})\right), \quad 0 \le t \le T \le T^{\*}. \end{split} \tag{6}$$

The sum in the last line gives the extension over the (HJM) approach which allows us to deal with risky times in an arbitrage-free way.
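For a flat forward curve and a single atom of ν, the bond price (6) reduces to elementary arithmetic. The following sketch (illustrative numbers, with the atom counted when t < u₁ ≤ T) makes the extra discrete term explicit.

```python
import math

# Defaultable bond price (6) on {tau > t} for a flat forward curve
# f(t, u) = f0 plus a single risky time u1 with weight w1 and forward
# rate f_u1 at the atom.  All numbers are illustrative.
def bond_price(t, T, f0, u1, w1, f_u1):
    continuous = f0 * (T - t)                     # int_t^T f(t,u) du
    atom = w1 * f_u1 if t < u1 <= T else 0.0      # discrete term of (6)
    return math.exp(-(continuous + atom))

p = bond_price(t=0.0, T=5.0, f0=0.02, u1=2.0, w1=1.0, f_u1=0.05)
print(p)  # exp(-(0.02 * 5 + 1.0 * 0.05)) = exp(-0.15)
```

A maturity before the risky time (T < u₁) drops the atom entirely, so the price collapses to the classical HJM expression.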

The family of processes (f(t, T))_{0≤t≤T} for T ∈ [0, T^∗] is assumed to consist of Itô processes satisfying

$$f(t, T) = f(0, T) + \int\_0^t a(s, T)ds + \int\_0^t b(s, T) \cdot dW\_s \tag{7}$$

with an *n*-dimensional *Q*∗-Brownian motion *W*.

Denote by *B* the Borel σ-field over R.

**Assumption 2.2** We require the following technical assumptions:

(i) the initial forward curve is measurable, and integrable on [0, *T* ∗]:

$$\int\_0^{T^\*} |f(0, u)| \,\nu(du) < \infty, \quad \mathcal{Q}^\*\text{-a.s.}$$

(ii) the *drift parameter a*(ω,*s*, *t*) is R-valued *O* ⊗ *B*-measurable and integrable on [0, *T* ∗]:

$$\int\_0^{T^\*} \int\_0^{T^\*} |a(s, u)|\, ds \,\nu(du) < \infty, \quad \mathcal{Q}^\*\text{-a.s.},$$

(iii) the *volatility parameter b*(ω,*s*, *t*) is R*<sup>n</sup>*-valued, *O* ⊗ *B*-measurable, and

$$\sup\_{s,t \le T^\*} \parallel b(s,t) \parallel < \infty, \quad \mathcal{Q}^\*\text{-a.s.}$$

(iv) it holds that

$$0 \le \lambda(\mu\_i) \Delta A(\mu\_i) < w\_i, \quad i \ge 1.$$

Set

$$\begin{aligned} \bar{a}(t,T) &= \int\_{t}^{T} a(t,u)\nu(du), \\ \bar{b}(t,T) &= \int\_{t}^{T} b(t,u)\nu(du), \\ H'(t) &= \int\_{0}^{t} \lambda\_{s}\, ds - \sum\_{u\_{i}\leq t} \log\left(\frac{w\_{i}-\lambda\_{u\_{i}}\Delta A(u\_{i})}{w\_{i}}\right). \end{aligned} \tag{8}$$

The following theorem gives the desired drift condition in the generalized Merton models.

**Theorem 1** *Assume that Assumptions 2.1 and 2.2 hold. Then Q^∗ is an ELMM if and only if the following conditions hold:* {s : ΔA(s) ≠ 0} ⊂ {u₁, u₂, ...}*, and*

$$\int\_0^t f(\mathbf{s}, \mathbf{s}) \nu(d\mathbf{s}) = \int\_0^t r\_s d\mathbf{s} + H'(t),\tag{9}$$

$$
\bar{a}(t, T) = \frac{1}{2} \parallel \bar{b}(t, T) \parallel^2,\tag{10}
$$

*for* 0 ≤ *t* ≤ *T* ≤ *T* <sup>∗</sup> *d Q*<sup>∗</sup> ⊗ *dt-almost surely on* {*t* < τ }*.*

The first condition, (9), can be split into its continuous and pure-jump parts, so that (9) is equivalent to


$$\begin{aligned} f(t, t) &= r\_t + \lambda\_t \\ f(u\_i, u\_i) &= \log \frac{w\_i}{w\_i - \lambda\_{u\_i}\Delta A(u\_i)} \ge 0. \end{aligned}$$

The second relation states explicitly the connection between the forward rate at a risky time u_i and the probability Q^∗(τ = u_i | ℱ_{u_i−}), given that τ ≥ u_i, of course. Moreover, if ΔA(u_i) = w_i, it simplifies to

$$f(u\_i, u\_i) = -\log(1 - \lambda\_{u\_i}).\tag{11}$$

For the proof we first provide the canonical decomposition of

$$J(t,T) := \int\_t^T f(t,u)\nu(du), \quad 0 \le t \le T.$$

**Lemma 1** *Assume that Assumption 2.2 holds. Then, for each T* ∈ [0, *T* ∗]*the process* (*J* (*t*, *T* ))<sup>0</sup>≤*t*≤*<sup>T</sup> is a special semimartingale and*

$$J(t,T) = \int\_0^T f(0,u)\nu(du) + \int\_0^t \bar{a}(u,T)\, du + \int\_0^t \bar{b}(u,T)\, dW\_{u} - \int\_0^t f(u,u)\nu(du).$$

*Proof* Using the stochastic Fubini Theorem (as in [26]), we obtain

$$\begin{split} J(t,T) &= \int\_{t}^{T} \left( f(0,u) + \int\_{0}^{t} a(s,u)ds + \int\_{0}^{t} b(s,u)dW\_{s} \right) \nu(du) \\ &= \int\_{0}^{T} f(0,u)\nu(du) + \int\_{0}^{t} \int\_{s}^{T} a(s,u)\nu(du)ds + \int\_{0}^{t} \int\_{s}^{T} b(s,u)\nu(du)dW\_{s} \\ &\quad - \int\_{0}^{t} f(0,u)\nu(du) - \int\_{0}^{t} \int\_{s}^{t} a(s,u)\nu(du)ds - \int\_{0}^{t} \int\_{s}^{t} b(s,u)\nu(du)dW\_{s} \\ &= \int\_{0}^{T} f(0,u)\nu(du) + \int\_{0}^{t} \bar{a}(s,T)ds + \int\_{0}^{t} \bar{b}(s,T)dW\_{s} \\ &\quad - \int\_{0}^{t} \left( f(0,u) + \int\_{0}^{u} a(s,u)ds + \int\_{0}^{u} b(s,u)dW\_{s} \right)\nu(du), \end{split}$$

and the claim follows.

*Proof* (*Proof of Theorem* 1) Set E(t) = 1_{{τ>t}} and F(t, T) = exp(−∫_t^T f(t, u) ν(du)), such that P(t, T) = E(t)F(t, T). Integration by parts yields

$$dP(t,T) = F(t-,T)dE(t) + E(t-)dF(t,T) + d[E, F(\cdot,T)]\_t =: (1') + (2') + (3').\tag{12}$$

In view of (1′), we obtain from (4) that

$$E(t) + \int\_0^{t \wedge \tau} \lambda\_s dA(\mathbf{s}) =: M\_t^1 \tag{13}$$

is a martingale. Regarding (2′), note that from Lemma 1 we obtain by Itô's formula that

$$\begin{split} \frac{dF(t,T)}{F(t-,T)} &= \left( f(t,t) - \bar{a}(t,T) + \frac{1}{2} \parallel \bar{b}(t,T) \parallel^2 \right) dt \\ &+ \sum\_{i \ge 1} \left( e^{f(t,t)} - 1 \right) w\_i \delta\_{u\_i}(dt) + dM\_t^2, \end{split} \tag{14}$$

with a local martingale M². For the remaining term (3′), note that

$$\begin{aligned} \sum\_{0 < s \le t} \Delta E(s)\, \Delta F(s,T) &= \int\_0^t F(s-,T)(e^{f(s,s)}-1)\nu(\{s\})\, dM\_s^1 \\ &\quad - \int\_0^{t\wedge\tau} F(s-,T)(e^{f(s,s)}-1)\nu(\{s\})\, \lambda\_s\, dA(s). \end{aligned} \tag{15}$$

Inserting (14) and (15) into (12) we obtain

$$\begin{aligned} \frac{dP(t,T)}{P(t-,T)} &= -\lambda\_t dA(t) \\ &+ \left(f(t,t) - \bar{a}(t,T) + \frac{1}{2} \parallel \bar{b}(t,T) \parallel^2\right) dt \\ &+ \sum\_{i \ge 1} \left(e^{f(t,t)} - 1\right) w\_i \delta\_{u\_i}(dt) \\ &- \nu(\{t\}) (e^{f(t,t)} - 1) \lambda\_t dA(t) + dM\_t^3 \end{aligned}$$

with a local martingale *M*3. We obtain a *Q*∗-local martingale if and only if the drift vanishes. Next, we can separate between absolutely continuous and discrete part. The absolutely continuous part yields (10) and *f* (*t*, *t*) = *rt* + λ*<sup>t</sup> d Q*<sup>∗</sup> ⊗ *dt*-almost surely. It remains to compute the discontinuous part, which is given by

$$\sum\_{i:u\_i \le t} P(u\_i - , T)(e^{f(u\_i, u\_i)} - 1)w\_i - \sum\_{0 < s \le t} P(s - , T)e^{f(s, s)}\lambda\_s \Delta A(s),$$

for 0 ≤ t ≤ T ≤ T^∗. This yields {s : ΔA(s) ≠ 0} ⊂ {u₁, u₂, ...}. The discontinuous part vanishes if and only if

$$1\_{\{u\_i \le T^\* \wedge \tau\}} e^{-f(u\_i, u\_i)} w\_i = 1\_{\{u\_i \le T^\* \wedge \tau\}} \left(w\_i - \lambda\_{u\_i} \Delta A(u\_i)\right), \quad i \ge 1,$$

which is equivalent to

$$1\_{\{u\_i \le T^\* \wedge \tau\}} f(u\_i, u\_i) = -1\_{\{u\_i \le T^\* \wedge \tau\}} \log \frac{w\_i - \lambda\_{u\_i} \Delta A(u\_i)}{w\_i}, \quad i \ge 1.$$

We obtain (9) and the claim follows.

*Example 1* (*The Merton model*) The paper [23] considers a simple capital structure of a firm, consisting only of equity and a zero-coupon bond with maturity *U* > 0. The firm defaults at *U* if the total market value of its assets is not sufficient to cover the liabilities.

We are interested in setting up an arbitrage-free market for credit derivatives and consider a market of defaultable bonds *P*(*t*, *T* ), 0 ≤ *t* ≤ *T* ≤ *T* <sup>∗</sup> with 0 < *U* ≤ *T* <sup>∗</sup> as basis for more complex derivatives. In a stylized form the Merton model can be represented by a Brownian motion *W* denoting the normalized logarithm of the firm's assets, a constant *K* > 0 and the default time

$$
\tau = \begin{cases} U & \text{if } W\_U \le K \\ \infty & \text{otherwise.} \end{cases}
$$

Assume for simplicity a constant interest rate *r* and let F be the filtration generated by *W*. Then *P*(*t*, *T* ) = *e*−*r*(*T*−*t*) whenever *T* < *U* because these bonds do not carry default risk. On the other hand, for *t* < *U* ≤ *T* ,

$$P(t,T) = e^{-r(T-t)}E^\*[\mathbf{1}\_{\{\tau > T\}}|\mathcal{F}\_t] = e^{-r(T-t)}E^\*[\mathbf{1}\_{\{\tau = \infty\}}|\mathcal{F}\_t] = e^{-r(T-t)}\Phi\left(\frac{W\_t - K}{\sqrt{U-t}}\right),$$

where Φ denotes the cumulative distribution function of a standard normal random variable and *E*<sup>∗</sup> denotes the expectation with respect to *Q*∗. For *t* → *U* we recover *P*(*U*, *U*) = 1{τ=∞}. The derivation of representation (6) with ν(*du*) := *du* + δ*<sup>U</sup>* (*du*) is straightforward. A simple calculation with

$$P(t, T) = \mathbf{1}\_{\{\tau > t\}} \exp\left(-\int\_{t}^{T} f(t, u) du - f(t, U)\mathbf{1}\_{\{t < U \le T\}}\right) \tag{16}$$

yields f(t, T) = r for T ≠ U and

$$f(t, U) = -\log \Phi \left(\frac{W\_t - K}{\sqrt{U - t}}\right).$$

By Itô's formula we obtain

$$b(t,U) = -\frac{\varphi\left(\frac{W\_t - K}{\sqrt{U-t}}\right)}{\Phi\left(\frac{W\_t - K}{\sqrt{U-t}}\right)}(U-t)^{-1/2},$$

and indeed a(t, U) = ½ b²(t, U). Note that the conditions of Theorem 1 hold, and the market consisting of the bonds P(t, T) satisfies NAFL, as expected. More flexible models of arbitrage-free bond prices can be obtained if the market filtration F is allowed to be more general, as we show in Sect. 3 on affine generalized Merton models.
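The closed form of the stylized Merton bond price can be cross-checked by simulating W_U given W_t, since the bond pays if and only if W_U > K. Parameter values below are illustrative.

```python
import math
import numpy as np
from statistics import NormalDist

# Monte Carlo check of the stylized Merton bond price for t < U <= T:
# P(t, T) = exp(-r (T - t)) * Phi((W_t - K) / sqrt(U - t)).
# All parameter values are illustrative.
rng = np.random.default_rng(5)
r, K, t, U, T, W_t = 0.02, -0.5, 1.0, 4.0, 5.0, 0.3
# W_U given W_t is Gaussian with mean W_t and variance U - t
W_U = W_t + math.sqrt(U - t) * rng.standard_normal(500_000)
mc = math.exp(-r * (T - t)) * np.mean(W_U > K)      # survive iff W_U > K
exact = math.exp(-r * (T - t)) * NormalDist().cdf((W_t - K) / math.sqrt(U - t))
print(mc, exact)
```

The same sampling idea extends to the affine generalized Merton models of Sect. 3, with the Gaussian conditional law replaced by the relevant factor distribution.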

*Example 2* (*An extension of the Black–Cox model*) The model suggested in [4] uses a first-passage time approach to credit risk. Default happens at the first time the firm value falls below a pre-specified boundary, the default boundary. We consider a stylized version of this approach and continue Example 1. Extending the original approach, we include a zero-coupon bond with maturity U. The reduction of the firm value at U is equivalent to considering a default boundary with an upward jump at that time. Hence, we consider a Brownian motion W and the default boundary

$$D(t) = D(0) + K\mathbf{1}\_{\{t \ge U\}}, \quad t \ge 0,$$

with *D*(0) < 0, and let default be the first time when *W* hits *D*, i.e.

$$\tau = \inf \{ t \ge 0 \, : \, W\_t \le D(t) \} $$

with the usual convention that inf ∅=∞. The following lemma computes the default probability in this setting and the forward rates are directly obtained from this result together with (16). The filtration G = F is given by the natural filtration of the Brownian motion *W* after completion. Denote the random sets

$$\begin{aligned} \Delta\_1 &:= \left\{ (x, y) \in \mathbb{R}^2 : x\sqrt{T - U} \le D(U) - \left( y\sqrt{U - t} + W\_t \right),\ y\sqrt{U - t} + W\_t > D(0) \right\}, \\ \Delta\_2 &:= \left\{ (x, y) \in \mathbb{R}^2 : x\sqrt{T - U} \le D(U) - \left( y\sqrt{U - t} + 2D(0) - W\_t \right),\ y\sqrt{U - t} + D(0) - W\_t > 0 \right\}. \end{aligned}$$

**Lemma 2** *Let D*(0) < 0*, U* > 0 *and D*(*U*) ≥ *D*(0)*. For* 0 ≤ *t* < *U, it holds on* {τ > *t*}*, that*

$$P(\tau > T | \mathcal{F}\_t) = 1 - 2\Phi\left(\frac{D(0) - W\_t}{\sqrt{T - t}}\right) - \mathbf{1}\_{\{T \ge U\}}\, 2(\Phi\_2(\Delta\_1) - \Phi\_2(\Delta\_2)), \quad (17)$$

*where* Φ₂ *denotes the distribution function of a two-dimensional standard normal vector and the sets* Δ₁ = Δ₁(D) *and* Δ₂ = Δ₂(D) *are defined above.*

*For t* ≥ *U it holds on* {τ > *t*}*, that*

$$P(\tau > T | \mathcal{F}\_t) = 1 - 2\Phi\left(\frac{D(U) - W\_t}{\sqrt{T - t}}\right).$$
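The second statement of Lemma 2 (the case t ≥ U, where only the shifted barrier D(U) matters) can be checked by brute-force path simulation. The Euler scheme below monitors the barrier only on a grid, so it slightly overestimates survival; parameters are illustrative.

```python
import math
import numpy as np
from statistics import NormalDist

# First-passage check of Lemma 2 for t >= U: on {tau > t},
# P(tau > T | F_t) = 1 - 2 Phi((D(U) - W_t) / sqrt(T - t)).
# Discrete barrier monitoring biases the estimate slightly upward.
rng = np.random.default_rng(6)
D_U, W_t, t, T = -0.8, 0.0, 5.0, 7.0
n_paths, n_steps = 100_000, 1_000
dt = (T - t) / n_steps
W = np.full(n_paths, W_t)
alive = np.ones(n_paths, dtype=bool)
for _ in range(n_steps):
    W = W + math.sqrt(dt) * rng.standard_normal(n_paths)
    alive &= W > D_U         # barrier check on the grid only
mc = alive.mean()
exact = 1.0 - 2.0 * NormalDist().cdf((D_U - W_t) / math.sqrt(T - t))
print(mc, exact)
```

A finer grid (or a Brownian-bridge correction for the crossing probability between grid points) shrinks the discrete-monitoring bias towards the reflection-principle value.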

*Proof* The first part of (17), where T < U, follows directly from the reflection principle and the fact that W has independent and stationary increments. Next, consider 0 ≤ t < U ≤ T. Then, on {W_U > D(U)},

$$P(\inf\_{\left[U,T\right]} W > D(U) | \mathcal{F}\_U) = 1 - 2\Phi\left(\frac{D(U) - W\_U}{\sqrt{T - U}}\right). \tag{18}$$

Moreover, on $\{W_t > D(0)\}$ it holds for $x > D(0)$ that

$$\begin{aligned} P\Big(\inf_{[0,U]} W > D(0),\, W_U > x \,\Big|\, \mathcal{F}_t\Big) &= P(W_U > x \,|\, \mathcal{F}_t) - P\Big(W_U > x,\, \inf_{[0,U]} W \le D(0) \,\Big|\, \mathcal{F}_t\Big) \\ &= \Phi\left(\frac{W_t - x}{\sqrt{U - t}}\right) - \Phi\left(\frac{2D(0) - x - W_t}{\sqrt{U - t}}\right). \end{aligned}$$

Hence, $E\big[g(W_U)\mathbf{1}_{\{\inf_{[0,U]} W > D(0)\}} \,\big|\, \mathcal{F}_t\big] = \mathbf{1}_{\{\inf_{[0,t]} W > D(0)\}} \int_{D(0)}^{\infty} g(x)\, f_t(x)\, dx$ with density

$$f_t(x) = \mathbf{1}_{\{x > D(0)\}} \frac{1}{\sqrt{U - t}} \left[ \phi\left( \frac{W_t - x}{\sqrt{U - t}} \right) - \phi\left( \frac{2D(0) - x - W_t}{\sqrt{U - t}} \right) \right].$$

Together with (18) this yields on $\{\inf_{[0,t]} W > D(0)\}$

$$\begin{aligned} P\Big(\inf_{[0,T]} (W - D) > 0 \,\Big|\, \mathcal{F}_t\Big) &= \int_{D(0)}^{\infty} \left[ 1 - 2\Phi\left(\frac{D(U) - x}{\sqrt{T - U}}\right) \right] f_t(x)\, dx \\ &= P\Big(\inf_{[t,T]} W > D(0) \,\Big|\, \mathcal{F}_t\Big) - 2 \int_{D(0)}^{\infty} \Phi\left(\frac{D(U) - x}{\sqrt{T - U}}\right) f_t(x)\, dx. \end{aligned}$$

It remains to compute the integral. Regarding the first part, letting ξ and η be independent standard normal random variables, we obtain that

$$\begin{aligned} &\int_{D(0)}^{\infty} \Phi\left(\frac{D(U)-x}{\sqrt{T-U}}\right) \frac{1}{\sqrt{U-t}}\, \phi\left(\frac{x-W_t}{\sqrt{U-t}}\right) dx \\ &= P_t\Big(\sqrt{T-U}\,\xi \le D(U) - \big(\sqrt{U-t}\,\eta + W_t\big),\ \sqrt{U-t}\,\eta + W_t > D(0)\Big) \\ &= \Phi_2(\Delta_1), \end{aligned}$$

where we abbreviate *Pt*(·) = *P*(·|*Ft*). In a similar way,

$$\begin{aligned} &\int_{D(0)}^{\infty} \Phi\left( \frac{D(U) - x}{\sqrt{T - U}} \right) \frac{1}{\sqrt{U - t}}\, \phi\left( \frac{x - (2D(0) - W_t)}{\sqrt{U - t}} \right) dx \\ &= P_t\Big( \sqrt{T - U}\,\xi \le D(U) - \big(\sqrt{U - t}\,\eta + 2D(0) - W_t\big),\ \sqrt{U - t}\,\eta + D(0) - W_t > 0 \Big) \\ &= \Phi_2(\Delta_2) \end{aligned}$$

and we conclude.
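The reflection-principle identity used throughout this proof, $P(\inf_{[0,T]} W > a) = 1 - 2\Phi(a/\sqrt{T})$ for $a < 0$, can be checked by simulation. The following sketch (helper names and parameter values are ours, not the chapter's) adds a Brownian-bridge crossing correction in each time step, which removes the bias of naive discrete barrier monitoring.

```python
import math
import random

def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def survival_exact(a: float, T: float) -> float:
    # P(inf_{[0,T]} W > a) = 1 - 2*Phi(a/sqrt(T)) for a < 0, by reflection
    return 1.0 - 2.0 * norm_cdf(a / math.sqrt(T))

def survival_mc(a: float, T: float, n_paths: int = 20000,
                n_steps: int = 50, seed: int = 42) -> float:
    # Euler simulation of W with a Brownian-bridge correction for the
    # probability of crossing the flat barrier a inside each step
    rng = random.Random(seed)
    dt = T / n_steps
    sq = math.sqrt(dt)
    survived = 0
    for _ in range(n_paths):
        w = 0.0
        alive = True
        for _ in range(n_steps):
            w_new = w + sq * rng.gauss(0.0, 1.0)
            if w_new <= a:
                alive = False
                break
            # crossing probability of the bridge between w and w_new
            q = math.exp(-2.0 * (w - a) * (w_new - a) / dt)
            if rng.random() < q:
                alive = False
                break
            w = w_new
        if alive:
            survived += 1
    return survived / n_paths

a, T = -1.0, 2.0
print(survival_exact(a, T))   # about 0.5205
print(survival_mc(a, T))
```

With these parameters the Monte Carlo estimate agrees with the closed form to within sampling error.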

# **3 Affine Models in the Generalized Intensity-Based Framework**

Affine processes are a well-known tool in the financial literature, one reason being their analytical tractability. In this section we closely follow [12] and briefly state the affine models which fit into the generalized intensity-based framework. For proofs, we refer the reader to this paper.

The main point is that affine processes in the literature are assumed to be *stochastically continuous* (see [8, 10]). Due to the discontinuities introduced in the generalized intensity-based framework, we propose to consider *piecewise continuous affine processes*.

*Example 3* Consider a non-negative integrable function λ, a constant κ ≥ 0 and a deterministic time *u* > 0. Set

$$K(t) = \int_0^t \lambda(s)\,ds + \mathbf{1}_{\{t \ge u\}} \kappa, \quad t \ge 0.$$

Let the default time τ be given by $\tau = \inf\{t \ge 0 : K(t) \ge \zeta\}$ with a standard exponential random variable ζ. Then $P(\tau = u \,|\, \tau \ge u) = 1 - e^{-\kappa} =: \lambda'$. Considering $\nu(ds) = ds + \delta_u(ds)$ with $u_1 = u$ and $w_1 = 1$, we are in the setup of the previous section. The drift condition (9) holds if

$$f(u,u) = -\log(1 - \lambda') = \kappa.$$

Note, however, that *K* is not the compensator of *H*. Indeed, the compensator of *H* equals $\Lambda_t = \int_0^{t \wedge \tau} \lambda(s)\,ds + \mathbf{1}_{\{t \ge u\}}\lambda'$; see [19] for general results in this direction.
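The mechanics of Example 3 can be checked numerically: since *K* is piecewise linear with a jump of size κ at *u*, the default time can be sampled exactly from ζ, and for constant intensity λ the unconditional probability of default exactly at *u* is $e^{-\lambda u}(1 - e^{-\kappa})$. A minimal sketch with parameter values of our own choosing:

```python
import math
import random

def default_time(zeta: float, lam: float, kappa: float, u: float) -> float:
    # K(t) = lam*t for t < u and K(t) = lam*t + kappa for t >= u;
    # tau = inf{t : K(t) >= zeta}
    if zeta <= lam * u:
        return zeta / lam
    if zeta <= lam * u + kappa:
        return u                      # the jump of K triggers default at u
    return (zeta - kappa) / lam

lam, kappa, u = 0.5, 0.7, 1.0
rng = random.Random(1)
n = 50000
hits = sum(1 for _ in range(n)
           if default_time(rng.expovariate(1.0), lam, kappa, u) == u)
p_exact = math.exp(-lam * u) * (1.0 - math.exp(-kappa))
print(hits / n, p_exact)   # both about 0.305
```

The positive mass of τ at the deterministic time *u* is exactly what a classical intensity-based model cannot produce.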

The purpose of this section is to give a suitable extension of the above example involving affine processes. Recall that we consider a σ-finite measure

$$\nu(du) = du + \sum\_{i \ge 1} w\_i \delta\_{u\_i}(du),$$

as well as $A(u) = u + \sum_{i \ge 1} w_i \mathbf{1}_{\{u \ge u_i\}}$. The idea is to consider an affine process *X* and study arbitrage-free doubly stochastic term structure models where the compensator Λ of the default indicator process $H = \mathbf{1}_{\{\cdot\, \le \tau\}}$ is given by

$$\Lambda_t = \int_0^t \left(\phi_0(s) + \psi_0(s)^\top \cdot X_s\right) ds + \sum_{i \ge 1} \mathbf{1}_{\{t \ge u_i\}} \left(1 - e^{-\phi_i - \psi_i^\top \cdot X_{u_i}}\right). \tag{19}$$

Note that by continuity of *X*, $\Lambda_t(\omega) < \infty$ for almost all ω. To ensure that Λ is nondecreasing we will require that $\phi_0(s) + \psi_0(s)^\top \cdot X_s \ge 0$ for all $s \ge 0$ and $\phi_i + \psi_i^\top \cdot X_{u_i} \ge 0$ for all $i \ge 1$.

Consider a state space in canonical form $\mathcal{X} = \mathbb{R}^m_{\ge 0} \times \mathbb{R}^n$ for integers $m, n \ge 0$ with $m + n = d$, and a *d*-dimensional Brownian motion *W*. Let μ and σ be defined on $\mathcal{X}$ by

$$\mu(x) = \mu_0 + \sum_{i=1}^d x_i \mu_i, \tag{20}$$

$$\frac{1}{2}\sigma(\mathbf{x})^\top \sigma(\mathbf{x}) = \sigma\_0 + \sum\_{i=1}^d x\_i \sigma\_i,\tag{21}$$

where $\mu_0, \mu_i \in \mathbb{R}^d$ and $\sigma_0, \sigma_i \in \mathbb{R}^{d \times d}$ for all $i \in \{1, \dots, d\}$. We assume that the parameters $\mu_i, \sigma_i$, $i = 0, \dots, d$, are admissible in the sense of Theorem 10.2 in [11]. Then the continuous, unique strong solution of the stochastic differential equation

$$dX_t = \mu(X_t)\,dt + \sigma(X_t)\,dW_t, \quad X_0 = x, \tag{22}$$

is an *affine* process *X* on the state space *X* , see Chap. 10 in [11] for a detailed exposition.

We call a bond-price model *affine* if there exist functions *A* : R≥<sup>0</sup> × R≥<sup>0</sup> → R, *B* : R≥<sup>0</sup> × R≥<sup>0</sup> → R*<sup>d</sup>* such that

$$P(t,T) = \mathbf{1}\_{\{\tau > t\}} e^{-A(t,T) - B(t,T)^\top \cdot X\_t},\tag{23}$$

for $0 \le t \le T \le T^*$. We assume that $A(\cdot, T)$ and $B(\cdot, T)$ are right-continuous. Moreover, we assume that $t \mapsto A(t, T)$ and $t \mapsto B(t, T)$ are differentiable from the right and denote by $\partial_t^+$ the right derivative. For the convenience of the reader we state the following proposition giving sufficient conditions for absence of arbitrage in an affine generalized intensity-based setting. It extends [12], where only finitely many risky times were treated.

**Proposition 1** *Assume that* $\phi_0 : \mathbb{R}_{\ge 0} \to \mathbb{R}$ *and* $\psi_0 : \mathbb{R}_{\ge 0} \to \mathbb{R}^d$ *are continuous,* $\phi_0(s) + \psi_0(s)^\top \cdot x \ge 0$ *for all* $s \ge 0$ *and* $x \in \mathcal{X}$*, and the constants* $\phi_i \in \mathbb{R}$ *and* $\psi_i \in \mathbb{R}^d$, $i \ge 1$*, satisfy* $\phi_i + \psi_i^\top \cdot x \ge 0$ *for all* $i \ge 1$ *and* $x \in \mathcal{X}$ *as well as* $\sum_{i \ge 1} |w_i|\big(|\phi_i| + |\psi_{i,1}| + \cdots + |\psi_{i,d}|\big) < \infty$*. Moreover, let the functions* $A : \mathbb{R}_{\ge 0} \times \mathbb{R}_{\ge 0} \to \mathbb{R}$ *and* $B : \mathbb{R}_{\ge 0} \times \mathbb{R}_{\ge 0} \to \mathbb{R}^d$ *be the unique solutions of*

$$\begin{aligned} A(T, T) &= 0\\ A(u\_i, T) &= A(u\_i - , T) - \phi\_i w\_i\\ -\partial\_t^+ A(t, T) &= \phi\_0(t) + \mu\_0^\top \cdot B(t, T) - B(t, T)^\top \cdot \sigma\_0 \cdot B(t, T), \end{aligned} \tag{24}$$

*and*

$$\begin{aligned} B(T, T) &= 0\\ B\_k(u\_i, T) &= B\_k(u\_i - , T) - \psi\_{i,k} w\_i\\ -\partial\_t^+ B\_k(t, T) &= \psi\_{0,k}(t) + \mu\_k^\top \cdot B(t, T) - B(t, T)^\top \cdot \sigma\_k \cdot B(t, T), \end{aligned} \tag{25}$$

*for* 0 ≤ *t* ≤ *T . Then, the doubly-stochastic affine model given by* (19) *and* (23) *satisfies NAFL.*

*Proof* By construction,

$$A(t,T) = \int\_{t}^{T} a'(t,u)du + \sum\_{i:u\_{i}\in(t,T]} \phi\_{i}w\_{i}$$

$$B(t,T) = \int\_{t}^{T} b'(t,u)du + \sum\_{i:u\_{i}\in(t,T]} \psi\_{i}w\_{i}$$

with suitable functions $a'$ and $b'$ satisfying $a'(t, t) = \phi_0(t)$ and $b'(t, t) = \psi_0(t)$. A comparison of (23) with (6) yields the following: on the one hand, for $T = u_i \in \mathcal{U}$, we obtain $f(t, u_i) = \phi_i + \psi_i^\top \cdot X_t$. Hence, the coefficients $a(t, T)$ and $b(t, T)$ in (7) for $T = u_i \in \mathcal{U}$ compute to $a(t, u_i) = \psi_i^\top \cdot \mu(X_t)$ and $b(t, u_i) = \psi_i^\top \cdot \sigma(X_t)$.

On the other hand, for $T \notin \mathcal{U}$ we obtain that $f(t, T) = a'(t, T) + b'(t, T)^\top \cdot X_t$. Then the coefficients $a(t, T)$ and $b(t, T)$ can be computed as follows: applying Itô's formula to $f(t, T)$ and comparing with (7) yields that

$$\begin{aligned} a(t, T) &= \partial_t a'(t, T) + \partial_t b'(t, T)^\top \cdot X_t + b'(t, T)^\top \cdot \mu(X_t) \\ b(t, T) &= b'(t, T)^\top \cdot \sigma(X_t). \end{aligned} \tag{26}$$

Set $\bar{a}'(t, T) = \int_t^T a'(t, u)\,du$ and $\bar{b}'(t, T) = \int_t^T b'(t, u)\,du$ and note that

$$\int_t^T \partial_t a'(t, u)\,du = \partial_t \bar{a}'(t, T) + a'(t, t).$$

As $\partial_t^+ A(t, T) = \partial_t \bar{a}'(t, T)$ and $\partial_t^+ B(t, T) = \partial_t \bar{b}'(t, T)$, we obtain from (26) that

$$\begin{aligned} \bar{a}(t, T) &= \int_t^T a(t, u)\,\nu(du) = \int_t^T a(t, u)\,du + \sum_{u_i \in (t, T]} w_i \psi_i^\top \cdot \mu(X_t) \\ &= \partial_t^+ A(t, T) + a'(t, t) + \left(\partial_t^+ B(t, T) + b'(t, t)\right)^\top \cdot X_t + B(t, T)^\top \cdot \mu(X_t), \\ \bar{b}(t, T) &= \int_t^T b(t, u)\,\nu(du) = \int_t^T b(t, u)\,du + \sum_{u_i \in (t, T]} w_i \psi_i^\top \cdot \sigma(X_t) \\ &= B(t, T)^\top \cdot \sigma(X_t) \end{aligned}$$

for $0 \le t \le T \le T^*$. We now show that under our assumptions the drift conditions (9) and (10) hold. Observe that, by Eqs. (24) and (25) and the affine specification (20) and (21), the drift condition (10) holds. Moreover, from (11),

$$\Delta H'(u_i) = \phi_i + \psi_i^\top \cdot X_{u_i}$$

and $\lambda_s = \phi_0(s) + \psi_0(s)^\top \cdot X_s$ by (19). We recover $\Delta\Lambda_{u_i} = 1 - \exp(-\phi_i - \psi_i^\top \cdot X_{u_i})$, taking values in $[0, 1)$ by assumption. Hence, (9) holds and the claim follows.

*Example 4* In the one-dimensional case we consider *X* given as the solution of

$$dX\_t = (\mu\_0 + \mu\_1 X\_t)dt + \sigma \sqrt{X\_t}dW\_t, \quad t \ge 0.$$

Consider only one risky time $u_1 = 1$ and let $\phi_0 = \phi_1 = 0$, $\psi_0 = 1$, such that

$$\Lambda_t = \int_0^t X_s\,ds + \mathbf{1}_{\{t \ge 1\}} \left(1 - e^{-\psi_1 X_1}\right).$$

Hence the conditional probability of no default at time 1, given survival up to just prior to 1, equals $e^{-\psi_1 X_1}$; compare Example 3.

An arbitrage-free model can be obtained by choosing *A* and *B* according to Proposition 1, which can be achieved immediately using Lemma 10.12 from [11] (see in particular Sect. 10.3.2.2 on the CIR short-rate model): denote $\theta = \sqrt{\mu_1^2 + 2\sigma^2}$ and

$$\begin{aligned} L\_1(t) &= 2(e^{\theta t} - 1), \\ L\_2(t) &= \theta(e^{\theta t} + 1) + \mu\_1(e^{\theta t} - 1), \\ L\_3(t) &= \theta(e^{\theta t} + 1) - \mu\_1(e^{\theta t} - 1), \\ L\_4(t) &= \sigma^2(e^{\theta t} - 1). \end{aligned}$$

Then

$$A_0(t) = \frac{2\mu_0}{\sigma^2} \log\left(\frac{2\theta e^{\frac{(\theta - \mu_1)t}{2}}}{L_3(t)}\right), \quad B_0(t) = -\frac{L_1(t)}{L_3(t)}$$

are the unique solutions of the Riccati equations $B_0' = \sigma^2 B_0^2 - \mu_1 B_0$ with boundary condition $B_0(0) = 0$ and $A_0' = -\mu_0 B_0$ with boundary condition $A_0(0) = 0$. Note that with $A(t, T) = A_0(T - t)$ and $B(t, T) = B_0(T - t)$ for $0 \le t \le T < 1$, the conditions of Proposition 1 hold. Similarly, for $1 \le t \le T$, choosing $A(t, T) = A_0(T - t)$ and $B(t, T) = B_0(T - t)$ implies again the validity of (24) and (25). On the other hand, for $0 \le t < 1$ and $T \ge 1$ we set $u(T) = B(1, T) + \psi_1 = B_0(T - 1) + \psi_1$, according to (25), and let

$$\begin{aligned} A(t, T) &= \frac{2\mu_0}{\sigma^2} \log\left( \frac{2\theta e^{\frac{(\theta - \mu_1)(1 - t)}{2}}}{L_3(1 - t) - L_4(1 - t)\,u(T)} \right), \\ B(t, T) &= -\frac{L_1(1 - t) - L_2(1 - t)\,u(T)}{L_3(1 - t) - L_4(1 - t)\,u(T)}. \end{aligned}$$

It is easy to see that (24) and (25) are also satisfied in this case, in particular $\Delta A(1, T) = -\phi_1 = 0$ and $\Delta B(1, T) = -\psi_1$. Note that, while *X* is continuous, the bond prices are not even stochastically continuous because they jump almost surely at $u_1 = 1$. We conclude by Proposition 1 that this affine model is arbitrage-free.
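The closed-form functions $L_1, \dots, L_4$ come from the standard CIR Riccati equation, and their consistency can be sanity-checked numerically. The sketch below uses the common textbook convention $b' = 1 + \mu_1 b - \frac{\sigma^2}{2} b^2$, $b(0) = 0$, whose solution is $L_1/L_3$ with $\theta = \sqrt{\mu_1^2 + 2\sigma^2}$ (the chapter's $B_0$ differs from this by sign conventions); the function and parameter names are ours.

```python
import math

mu1, sigma = -0.3, 0.4
theta = math.sqrt(mu1 ** 2 + 2.0 * sigma ** 2)

def L1(t: float) -> float:
    return 2.0 * (math.exp(theta * t) - 1.0)

def L3(t: float) -> float:
    return theta * (math.exp(theta * t) + 1.0) - mu1 * (math.exp(theta * t) - 1.0)

def B_closed(t: float) -> float:
    # closed-form Riccati solution, standard CIR convention
    return L1(t) / L3(t)

def B_rk4(t_end: float, h: float = 1e-3) -> float:
    # numerically integrate b' = 1 + mu1*b - (sigma^2/2)*b^2, b(0) = 0
    f = lambda b: 1.0 + mu1 * b - 0.5 * sigma ** 2 * b ** 2
    b = 0.0
    for _ in range(round(t_end / h)):
        k1 = f(b)
        k2 = f(b + 0.5 * h * k1)
        k3 = f(b + 0.5 * h * k2)
        k4 = f(b + h * k3)
        b += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return b

print(B_closed(2.0), B_rk4(2.0))  # the two values agree
```

A fourth-order Runge–Kutta step is more than enough here; the two values agree to many digits, confirming that $L_1/L_3$ solves the Riccati equation.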

# **4 Conclusion**

In this article we studied a new class of dynamic term structure models with credit risk in which the compensator of the default time may jump at predictable times. This framework was called the generalized intensity-based framework. It extends existing theory and allows one to embed Merton's model into a reduced-form approach for pricing credit derivatives. Finally, we studied a class of highly tractable affine models which are only piecewise stochastically continuous.

**Acknowledgements** The KPMG Center of Excellence in Risk Management is acknowledged for organizing the conference "Challenges in Derivatives Markets - Fixed Income Modeling, Valuation Adjustments, Risk Management, and Regulation".

**Open Access** This chapter is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, a link is provided to the Creative Commons license and any changes made are indicated.

The images or other third party material in this chapter are included in the work's Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work's Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.

# **References**


# **Option Pricing and Sensitivity Analysis in the Lévy Forward Process Model**

**Ernst Eberlein, M'hamed Eddahbi and Sidi Mohamed Lalaoui Ben Cherif**

**Abstract** The purpose of this article is to give a closed Fourier-based valuation formula for a caplet in the framework of the Lévy forward process model which was introduced in Eberlein and Özkan, Financ. Stochast. 9:327-348, 2005, [5]. Afterwards, we compute Greeks by two approaches which come from totally different mathematical fields. The first is based on the integration-by-parts formula, which lies at the core of the application of the Malliavin calculus to finance. The second consists in using Fourier-based methods for pricing derivatives as exposed in Eberlein, Quantitative Energy Finance, 2014, [3]. We illustrate the results in the case where the jump part of the underlying model is driven by a time-inhomogeneous Gamma process and alternatively by a Variance Gamma process.

**Keywords** Option valuation · Lévy forward process model · Fourier transform · Time-inhomogeneous Lévy processes · Malliavin calculus · Greeks and sensitivity analysis

E. Eberlein (B)

Department of Mathematical Stochastics, University of Freiburg, Eckerstr. 1, 79104 Freiburg im Breisgau, Germany
e-mail: eberlein@stochastik.uni-freiburg.de

M. Eddahbi
Faculty of Sciences and Techniques, Department of Mathematics, Cadi Ayyad University, B.P. 549, Marrakech, Morocco
e-mail: m.eddahbi@uca.ma

S.M. Lalaoui Ben Cherif
Faculty of Sciences Semlalia, Department of Mathematics, Cadi Ayyad University, B.P. 2390, Marrakech, Morocco
e-mail: mohamed.lalaoui@ced.uca.ac.ma

We acknowledge financial support from the Federal Foreign Office of Germany which has been granted within a program of the German Academic Exchange Service (DAAD).

# **1 Introduction**

To compute expectations which arise as prices of derivative products is a key issue in quantitative finance. The effort which is necessary to get these values depends to a high degree on the sophistication of the model approach which is used. Simple models such as the classical geometric Brownian motion lead to easy-to-evaluate formulas for expectations but entail at the same time a high model risk. As has been shown in numerous studies, the empirical return distributions which one can observe are far from normality. This is true for all categories of financial markets: equity, fixed income, foreign exchange as well as credit markets (see e.g. Eberlein and Keller (1995) [4] for the analysis of stock price data and Eberlein and Kluge (2007) [7] for data from fixed income markets). A first step to reduce model risk and to improve the performance of the model consists in introducing volatility as a stochastic quantity. Some of the stochastic volatility models became quite popular. Nevertheless one must be aware that the distributions which diffusion processes with non-deterministic coefficients generate on a given time horizon are not known. They can only be determined approximately on the basis of simulations of process paths. In order to get more realistic distributions, an excellent choice is to replace the driving Brownian motion in classical models by a suitably chosen Lévy process. This can also be interpreted in the sense that instead of making volatility stochastic one can go over to a stochastic clock. The reason is that many Lévy processes can be obtained as time-changed Brownian motions. For example, the Variance Gamma process results when one replaces linear time by a Gamma process as subordinator. Of course, one can also consider both: a more powerful driver and stochastic volatility.
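The time-change interpretation mentioned above can be made concrete: a Variance Gamma variate is a drifted Brownian motion evaluated at an independent Gamma subordinator, and conditioning on the subordinator gives mean `th` and variance `sg**2 + th**2 * nu` in the notation of the sketch below (the parameter names are ours, not the chapter's).

```python
import math
import random
import statistics

def vg_sample(rng: random.Random, th: float, sg: float, nu: float,
              t: float = 1.0) -> float:
    # Variance Gamma variate: Brownian motion with drift th and volatility
    # sg, evaluated at an independent Gamma time G_t with mean t and
    # variance nu * t (the "stochastic clock")
    g = rng.gammavariate(t / nu, nu)
    return th * g + sg * math.sqrt(g) * rng.gauss(0.0, 1.0)

rng = random.Random(7)
th, sg, nu = 0.1, 0.3, 0.5
xs = [vg_sample(rng, th, sg, nu) for _ in range(100000)]

# conditioning on the subordinator gives E[X_1] = th and
# Var[X_1] = sg^2 + th^2 * nu
mean_est = statistics.fmean(xs)
var_est = statistics.pvariance(xs)
print(mean_est, var_est)
```

The extra variance term `th**2 * nu` and the heavier tails produced by the random clock are precisely what brings the return distribution closer to empirical data.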

Lévy processes are in a one-to-one correspondence to the rich class of infinitely divisible distributions and at the same time analytically well tractable. Due to the higher number of available parameters, this class of distributions is flexible enough to allow a much better fit to empirical return distributions. The systematic error which results from the assumption of normality is avoided. The generating distribution of a Lévy process shows up as the distribution of increments of length one. Consequently, any distribution which one gets by fitting a parametrized subclass to empirical return data can be implemented not only approximately but exactly into Lévy-driven models. Suitably parametrized model classes which have been used successfully so far are driven by generalized hyperbolic, normal inverse Gaussian (NIG), or Variance Gamma (VG) processes, just to mention a few.

As noted above, advanced models with superior statistical properties require more demanding numerical methods. Efficient and accurate algorithms are crucial in this context, in particular for calibration purposes. For pricing of derivatives the historical distribution, which can be derived from price data of the underlying and which is used for risk management, is of less interest. Calibration usually means to estimate the risk-neutral distribution parameters. In other words, one exploits price data of derivatives. In most cases this is given in terms of volatilities. Whereas years ago calibration was usually done overnight, many trading desks recalibrate nowadays on an intraday basis. During a calibration procedure in each iteration step a large number of model prices have to be computed and compared to market prices. A method which almost always works to get the corresponding expectations is Monte Carlo simulation. Its disadvantage is that it is computationally intensive and therefore too slow for many purposes. Another classical approach is to represent prices as solutions of partial differential equations (PDEs) which in the case of Lévy processes with jumps become partial integro–differential equations (PIDEs). This approach, which is based on the Feynman–Kac formula, applies to a wide range of valuation problems, in particular it allows to compute prices of American options as well. Nevertheless, the numerical solution of PIDEs rests on sophisticated discretization methods and corresponding programs. In this paper we concentrate on the third, namely the Fourier-based approach.

To manage portfolios of derivatives, traders have to understand how sensitive prices of derivative products are with respect to changes in the underlying parameters. For this purpose they need to know the Greeks which are given by the partial derivatives of the pricing functional with respect to those parameters. Usually Greeks are estimated by means of a finite difference approximation. Two kinds of errors are produced this way: the first one comes from the approximation of the derivative by a finite difference and the second one results from the numerical computation of the expectation. To eliminate one of the sources of error, Fournié et al. (1999) [9] adopted a new approach which consists in shifting the differential operator from the pricing functional to the diffusion kernel. This procedure results in an expectation operator applied to the payoff multiplied by a random weight function.

In the following we focus on a discrete tenor interest rate model which has been introduced in Eberlein and Özkan (2005) [5]. This so-called Lévy forward process model is driven by a time-inhomogeneous Lévy process and is developed on the basis of a backward induction that is necessary to get the LIBOR rates in a convenient homogeneous form. A major advantage of the forward process approach is that it is invariant under the measure change in the sense that the driving process remains a time-inhomogeneous Lévy process. Moreover, the measure changes do not only have the invariance property but in addition they are analytically and consequently also numerically much simpler compared to the corresponding measure changes in the so-called LIBOR model. The reason is that in each induction step the forward process itself represents up to a norming constant the density process on which the measure change is based. As a consequence, any approximation such as the 'frozen drift' approximation or more sophisticated versions of it are completely avoided. This means that the approximation error with which one has to struggle in the LIBOR approach does not show up in the forward process approach.

Another important aspect is that in the latter model the increments of the driving process translate directly into increments of the LIBOR rates. This is not the case for the LIBOR model where the increments of the LIBOR rates are proportional to the corresponding increments of the driving process scaled with the current value of the LIBOR rate. Expressed in terms of the terminology which will be developed in Sects. 2 and 3 this means that in the Lévy LIBOR model

$$L(t + \Delta t, T\_k) - L(t, T\_k) \sim L(t, T\_k) \left( L\_{t + \Delta t}^{T\_{k+1}} - L\_t^{T\_{k+1}} \right), \tag{1}$$

whereas in the Lévy forward process model

$$L(t + \Delta t, T\_k) - L(t, T\_k) \sim \delta\_k^{-1} \left( L\_{t + \Delta t}^{T\_{k+1}} - L\_t^{T\_{k+1}} \right). \tag{2}$$

The fact that the increments of the LIBOR rate process do not depend on current LIBOR values, translates into increased flexibility and a superior model performance of the forward process approach.

In addition to the differences in mathematical properties there is a fundamental economic difference. The forward process approach allows for negative interest rates as well as for negative starting values. This is of crucial importance in particular in the current economic environment where negative rates are common. Models in which interest rates stay strictly positive by construction are not able to produce realistic valuations for a large collection of interest rate derivatives in a deflationary or near-deflationary environment.

As far as the calculation of Greeks in this setting is concerned, we refer to Glasserman and Zhao (1999) [12], Glasserman (2004) [11], and Fries (2007) [10] where some treatment of this issue is given. The classical diffusion-based LIBOR market model offers a high degree of analytical tractability. However, this model cannot reproduce the phenomenon of changing volatility smiles along the maturity axis. In order to gain more flexibility, in a first step one can replace the driving Brownian motion by a (time-homogeneous) Lévy process. However, one observes that the shape of the volatility surface produced by cap and floor prices is too sophisticated to be matched with sufficient accuracy by a model which is driven by a time-homogeneous process. To achieve a more accurate calibration of the model across different strikes and maturities one has to use the more flexible class of time-inhomogeneous Lévy processes (see e.g. Eberlein and Özkan (2005) [5] and Eberlein and Kluge (2006) [6]). Graphs in the latter paper show in particular that interest rate models driven by time-inhomogeneous Lévy processes are able to reproduce implied volatility curves (smiles) observed in the market across all maturities with high accuracy. If one restricts the approach to (time-homogeneous) Lévy processes as drivers, the smiles flatten out too fast at longer maturities. Consequently, we have analytical reasons (the invariance under measure changes) as well as statistical reasons to choose time-inhomogeneous Lévy processes as drivers. In implementations of the model, a rather mild form of time-inhomogeneity already turns out to be sufficient. Typically one has to glue together three pieces of (time-homogeneous) Lévy processes in order to cover the full range of maturities with sufficient accuracy. In terms of parameters this means that instead of three or four one uses nine or twelve parameters.

The first goal of this paper is to give a closed Fourier-based valuation formula for a caplet in the framework of the Lévy forward process model. The second aim is to study sensitivities. We discuss two approaches for this purpose. The first is based on the integration-by-parts formula, which lies at the core of the application of the Malliavin calculus to finance as developed in Fournié et al. (1999) [9], León et al. (2002) [14], Petrou (2008) [17], Yablonski (2008) [19]. This approach is appropriate if the driving process has a diffusion component. The second approach, which covers purely discontinuous drivers as well, relies on Fourier-based methods for pricing derivatives. For a survey of Fourier-based methods see Eberlein (2014) [3]. We illustrate the result by applying the formula to the pricing of a caplet where the jump part of the underlying model is driven by a time-inhomogeneous Gamma process and alternatively by a Variance Gamma process.

# **2 The Lévy Forward Process Model**

Let $0 = T_0 < T_1 < \cdots < T_{n-1} < T_n = T^*$ denote a discrete tenor structure and set $\delta_k = T_{k+1} - T_k$ for all $k \in \{0, \dots, n-1\}$. Because we proceed by backward induction, let us use the notation $T_i^* := T_{n-i}$ and $\delta_i^* := \delta_{n-i}$ for $i \in \{1, \dots, n\}$. For zero-coupon bond prices $B(t, T_i^*)$ and $B(t, T_{i-1}^*)$, the forward process is defined by

$$F(t, T\_i^\*, T\_{i-1}^\*) = \frac{B(t, T\_i^\*)}{B(t, T\_{i-1}^\*)}.\tag{3}$$

Hence, modeling forward processes means specifying the dynamics of ratios of successive bond prices. Let $(\Omega, \mathcal{F} = \mathcal{F}_{T^*}, \mathbb{F}, \mathbb{P}_{T^*})$ be a complete stochastic basis, where $\mathbb{P}_{T^*}$ should be regarded as the forward martingale measure for the settlement date $T^* > 0$ and the filtration $\mathbb{F} = (\mathcal{F}_t)_{t \in [0, T^*]}$ satisfies the usual conditions. Consider a time-inhomogeneous Lévy process $L^{T^*}$ defined on this basis, starting at 0, with local characteristics $(b^{T^*}, c, F^{T^*})$ such that the drift term $b_s^{T^*} \in \mathbb{R}$, the volatility coefficient $c_s$ and the Lévy measure $F_s^{T^*}$ satisfy the following conditions

$$\exists \; \sigma > 0, \; \forall \; s \in [0, T^\*] : \; c\_s > \; \sigma, \; \; F\_s^{T^\*} (\{0\}) \; = \; 0 \tag{4}$$

and

$$\int\_0^{T^\*} \left( |b\_s^{T^\*}| + |c\_s| + \int\_{\mathbb{R}} \left( |x|^2 \wedge 1 \right) F\_s^{T^\*} (dx) \right) ds < \infty. \tag{5}$$

We impose as usual a further integrability condition. Note that the processes which we will define later are by construction martingales and therefore every single random variable has to be integrable.

**Assumption 2.1** (EM) There exists a constant *M* > 1 such that

$$\int\_{0}^{T^\*} \int\_{\{|x|>1\}} \exp(ux) F\_s^{T^\*}(dx) ds < \infty, \quad \forall \ u \in [-M, M]. \tag{6}$$

Under (EM) the random variable $L_t^{T^*}$ has a finite expectation and its law is given by the characteristic function

$$\mathbb{E}\left[e^{\mathrm{i}u L_t^{T^*}}\right] = \exp\left(\int_0^t \left(\mathrm{i}u\, b_s^{T^*} - \frac{1}{2}u^2 c_s + \int_{\mathbb{R}}\left(e^{\mathrm{i}u x} - 1 - \mathrm{i}u x\right) F_s^{T^*}(dx)\right) ds\right). \tag{7}$$
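Formula (7) can be verified by simulation for a toy time-inhomogeneous specification of our own choosing (piecewise-constant Gaussian variance and compound Poisson jumps with centred Gaussian marks), for which the inner integral is explicit:

```python
import cmath
import math
import random

def cf_formula(u: float) -> complex:
    # right-hand side of (7) for the toy specification: b_s = 0.1,
    # c_s = 0.04 on [0, 0.5) and 0.09 on [0.5, 1], and compound Poisson
    # jumps with rate 2 and centred Gaussian marks N(0, 0.2^2)
    b, int_c, lam, sj = 0.1, 0.5 * 0.04 + 0.5 * 0.09, 2.0, 0.2
    jump_part = lam * (cmath.exp(-0.5 * (u * sj) ** 2) - 1.0)
    return cmath.exp(1j * u * b - 0.5 * u ** 2 * int_c + jump_part)

def poisson(rng: random.Random, lam: float) -> int:
    # Knuth's multiplication-of-uniforms Poisson sampler
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def sample_L1(rng: random.Random) -> float:
    # one draw of L_1: drift + Gaussian part + compound Poisson part;
    # the jump marks are centred, so compensation adds no extra drift
    x = 0.1 + math.sqrt(0.065) * rng.gauss(0.0, 1.0)
    for _ in range(poisson(rng, 2.0)):
        x += rng.gauss(0.0, 0.2)
    return x

rng = random.Random(11)
n, u = 100000, 1.0
acc = sum(cmath.exp(1j * u * sample_L1(rng)) for _ in range(n)) / n
print(acc)             # Monte Carlo estimate of E[exp(i*u*L_1)]
print(cf_formula(u))   # value predicted by (7)
```

Time-inhomogeneity enters only through the deterministic integrals over the local characteristics, which is exactly why gluing together a few homogeneous pieces stays tractable.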

Furthermore, the process $L^{T^*}$ is a special semimartingale and thus its canonical representation has the simple form

$$L\_t^{T^\*} = \int\_0^t b\_s^{T^\*} ds + \int\_0^t \sqrt{c\_s} dW\_s^{T^\*} + \int\_0^t \int\_{\mathbb{R}} \mathbf{x} \tilde{\boldsymbol{\mu}}^{L^{T^\*}} (ds, d\mathbf{x}), \tag{8}$$

where (*W <sup>T</sup>* <sup>∗</sup> *<sup>t</sup>* )*<sup>t</sup>*≥<sup>0</sup> is a <sup>P</sup>*<sup>T</sup>* <sup>∗</sup> -standard Brownian motion and<sup>μ</sup>*<sup>L</sup><sup>T</sup>* <sup>∗</sup> := <sup>μ</sup>*<sup>L</sup><sup>T</sup>* <sup>∗</sup> − ν*<sup>T</sup>* <sup>∗</sup> is the P*<sup>T</sup>* <sup>∗</sup> compensated random measure of jumps of *L<sup>T</sup>* <sup>∗</sup> . As usual, μ*<sup>L</sup><sup>T</sup>* <sup>∗</sup> denotes the random measure of jumps of *L<sup>T</sup>* <sup>∗</sup> and ν*<sup>T</sup>* <sup>∗</sup> (*ds*, *dx*) := *F<sup>T</sup>* <sup>∗</sup> *<sup>s</sup>* (*dx*)*ds* the P*<sup>T</sup>* <sup>∗</sup> -compensator of μ*<sup>L</sup><sup>T</sup>* <sup>∗</sup> . We denote by θ*<sup>s</sup>* the cumulant function associated with the process *L<sup>T</sup>* <sup>∗</sup> as given in (8) with local characteristics (*b<sup>T</sup>* <sup>∗</sup> , *c*, *F<sup>T</sup>* <sup>∗</sup> ), that is, for appropriate *z* ∈ C

$$\theta_s(z) = z b_s^{T^*} + \frac{z^2}{2} c_s + \int_{\mathbb{R}} \left( e^{zx} - 1 - zx \right) F_s^{T^*}(dx), \tag{9}$$

where $c$ and $F^{T^*}$ are free parameters, whereas the drift characteristic $b^{T^*}$ will later be chosen to guarantee that the forward process is a martingale. The following ingredients are needed.

**Assumption 2.2** (LR.1) For any maturity $T_i^*$ there is a bounded, deterministic function $\lambda(\cdot, T_i^*) : [0, T^*] \to \mathbb{R}$ which represents the volatility of the forward process $F(\cdot, T_i^*, T_{i-1}^*)$. These functions satisfy

$\lambda(s, T_i^*) > 0$ for all $s \in [0, T_i^*]$ and $\lambda(s, T_i^*) = 0$ for $s > T_i^*$, for any maturity $T_i^*$. Moreover, $\sum_{i=1}^{n-1} \lambda(s, T_i^*) \le M$ for all $s \in [0, T^*]$, where *M* is the constant from Assumption (EM).

**Assumption 2.3** (LR.2) The initial term structure of zero-coupon bond prices $B(0, T\_i^\*)$ is strictly positive for all $i \in \{1, \dots, n\}$.

We begin by constructing the forward process with the most distant maturity and postulate

$$F(t, T\_1^\*, T^\*) = F(0, T\_1^\*, T^\*) \exp\left(\int\_0^t \lambda(s, T\_1^\*) dL\_s^{T^\*}\right). \tag{10}$$

Option Pricing and Sensitivity Analysis … 291

One forces this process to become a $\mathbb{P}\_{T^\*}$-martingale by choosing $b^{T^\*}$ such that

$$\begin{aligned} \int\_0^t \lambda(s, T\_1^\*) b\_s^{T^\*} ds &= -\frac{1}{2} \int\_0^t c\_s \lambda^2(s, T\_1^\*) ds \\ &\quad - \int\_0^t \int\_{\mathbb{R}} \left( e^{x\lambda(s, T\_1^\*)} - 1 - x\lambda(s, T\_1^\*) \right) \nu^{T^\*}(ds, dx). \end{aligned} \tag{11}$$

Then the forward process $F(\cdot, T\_1^\*, T^\*)$ can be given as a stochastic exponential

$$F(t, T\_1^\*, T^\*) = F(0, T\_1^\*, T^\*)\, \mathcal{E}\_t\left( Z(\cdot, T\_1^\*) \right) \tag{12}$$

with

$$Z(t, T\_1^\*) = \int\_0^t \sqrt{c\_s}\, \lambda(s, T\_1^\*) dW\_s^{T^\*} + \int\_0^t \int\_{\mathbb{R}} (e^{x\lambda(s, T\_1^\*)} - 1)\, \widetilde{\mu}^{L^{T^\*}}(ds, dx). \tag{13}$$

Since the forward process $F(\cdot, T\_1^\*, T^\*)$ is a $\mathbb{P}\_{T^\*}$-martingale, we can use it as a density process and define the forward martingale measure $\mathbb{P}\_{T\_1^\*}$ by setting

$$\frac{d\mathbb{P}\_{T\_1^\*}}{d\mathbb{P}\_{T^\*}} = \frac{F(T\_1^\*, T\_1^\*, T^\*)}{F(0, T\_1^\*, T^\*)} = \mathcal{E}\_{T\_1^\*}\left(Z(\cdot, T\_1^\*)\right). \tag{14}$$

By the semimartingale version of Girsanov's theorem (see Jacod and Shiryaev (1987) [13])

$$W\_t^{T\_1^\*} := W\_t^{T^\*} - \int\_0^t \sqrt{c\_s} \lambda(s, T\_1^\*) ds \tag{15}$$

is a $\mathbb{P}\_{T\_1^\*}$-standard Brownian motion and

$$\nu^{T\_1^\*}(dt, dx) := e^{x\lambda(t, T\_1^\*)}\, \nu^{T^\*}(dt, dx) = e^{x\lambda(t, T\_1^\*)} F\_t^{T^\*}(dx)\, dt \tag{16}$$

is the $\mathbb{P}\_{T\_1^\*}$-compensator of $\mu^{L^{T^\*}}$.

Continuing this way one gets the forward processes $F(\cdot, T\_i^\*, T\_{i-1}^\*)$ such that for all $i \in \{1, \dots, n\}$

$$F(t, T\_i^\*, T\_{i-1}^\*) = F(0, T\_i^\*, T\_{i-1}^\*) \exp\left(\int\_0^t \lambda(s, T\_i^\*) dL\_s^{T\_{i-1}^\*}\right). \tag{17}$$

The drift term $b^{T\_{i-1}^\*}$ is chosen in such a way that the forward process $F(\cdot, T\_i^\*, T\_{i-1}^\*)$ becomes a martingale under the forward measure $\mathbb{P}\_{T\_{i-1}^\*}$, that is

$$\begin{aligned} \int\_0^t \lambda(s, T\_i^\*) b\_s^{T\_{i-1}^\*} ds &= -\frac{1}{2} \int\_0^t c\_s \lambda^2(s, T\_i^\*) ds \\ &\quad - \int\_0^t \int\_{\mathbb{R}} \left( e^{x\lambda(s, T\_i^\*)} - 1 - x\lambda(s, T\_i^\*) \right) \nu^{T\_{i-1}^\*}(ds, dx). \end{aligned} \tag{18}$$

We propose the following choice for the functions $b^{T\_{i-1}^\*}$ for all $i \in \{1, \dots, n\}$:

$$\begin{cases} b\_s^{T\_{i-1}^\*} = -\dfrac{c\_s}{2} \lambda(s, T\_i^\*) - \displaystyle\int\_{\mathbb{R}} \left( \dfrac{e^{x\lambda(s, T\_i^\*)} - 1}{\lambda(s, T\_i^\*)} - x \right) F\_s^{T\_{i-1}^\*}(dx), & 0 \le s < T\_i^\* \\[2mm] b\_s^{T\_{i-1}^\*} = 0, & s \ge T\_i^\*. \end{cases} \tag{19}$$
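The martingale drift (19) is straightforward to evaluate numerically once $c\_s$ and a Lévy density for $F\_s^{T\_{i-1}^\*}$ are specified. The following Python sketch (function and parameter names are ours, not from the paper) approximates the jump integral by a Riemann sum on a truncated grid:

```python
import numpy as np

def martingale_drift(lam, c, levy_density, x_min=-5.0, x_max=5.0, n=200001):
    """Drift b_s^{T*_{i-1}} from Eq. (19) for 0 <= s < T_i*:
    b = -(c/2)*lam - int ( (e^{x*lam} - 1)/lam - x ) F(dx),
    with the jump integral approximated on a truncated grid."""
    x = np.linspace(x_min, x_max, n)
    x = x[x != 0.0]               # the integrand is O(x^2) near zero
    dx = (x_max - x_min) / (n - 1)
    integrand = ((np.exp(x * lam) - 1.0) / lam - x) * levy_density(x)
    return -0.5 * c * lam - np.sum(integrand) * dx
```

For a pure-diffusion specification ($F \equiv 0$) the formula collapses to $b\_s = -\tfrac{c\_s}{2}\lambda(s, T\_i^\*)$, which is a convenient sanity check.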

The driving process $L^{T\_{i-1}^\*}$ therefore becomes

$$\begin{aligned} L\_t^{T\_{i-1}^\*} &= -\int\_0^t \left( \frac{c\_s}{2} \lambda(s, T\_i^\*) + \int\_{\mathbb{R}} \left( \frac{e^{x\lambda(s, T\_i^\*)} - 1}{\lambda(s, T\_i^\*)} - x \right) F\_s^{T\_{i-1}^\*}(dx) \right) ds \\ &\quad + \int\_0^t \sqrt{c\_s}\, dW\_s^{T\_{i-1}^\*} + \int\_0^t \int\_{\mathbb{R}} x\, (\mu^{L^{T^\*}} - \nu^{T\_{i-1}^\*})(ds, dx) \end{aligned} \tag{20}$$

under the successive forward measures $\mathbb{P}\_{T\_i^\*}$, which are given by the recursive relation

$$\begin{cases} \dfrac{d\mathbb{P}\_{T\_i^\*}}{d\mathbb{P}\_{T\_{i-1}^\*}} = \dfrac{F(T\_i^\*, T\_i^\*, T\_{i-1}^\*)}{F(0, T\_i^\*, T\_{i-1}^\*)} = \mathcal{E}\_{T\_i^\*}\left( Z(\cdot, T\_i^\*) \right), & i \in \{1, \dots, n\} \\[2mm] \mathbb{P}\_{T\_0^\*} = \mathbb{P}\_{T^\*} \end{cases} \tag{21}$$

with

$$Z(t, T\_i^\*) = \int\_0^t \sqrt{c\_s}\, \lambda(s, T\_i^\*) dW\_s^{T\_{i-1}^\*} + \int\_0^t \int\_{\mathbb{R}} (e^{x\lambda(s, T\_i^\*)} - 1)\, \widetilde{\mu}^{L^{T\_{i-1}^\*}}(ds, dx), \tag{22}$$

where $(W\_t^{T\_{i-1}^\*})\_{t\ge 0}$ is a $\mathbb{P}\_{T\_{i-1}^\*}$-standard Brownian motion such that

$$\begin{cases} W\_t^{T^\*\_i} = W\_t^{T^\*\_{i-1}} - \int\_0^t \sqrt{c\_s} \lambda(s, T^\*\_i) ds, \quad i \in \{1, \dots, n\} \\\\ W\_t^{T^\*\_0} = W\_t^{T^\*} . \end{cases} \tag{23}$$

$\widetilde{\mu}^{L^{T\_{i-1}^\*}} := \mu^{L^{T^\*}} - \nu^{T\_{i-1}^\*}$ is the $\mathbb{P}\_{T\_{i-1}^\*}$-compensated random measure of jumps of $L^{T^\*}$ and $\nu^{T\_{i-1}^\*}(ds, dx) = F\_s^{T\_{i-1}^\*}(dx)\, ds$ is the $\mathbb{P}\_{T\_{i-1}^\*}$-compensator of $\mu^{L^{T^\*}}$ such that

$$\begin{cases} F\_s^{T\_i^\*}(dx) = e^{x\lambda(s, T\_i^\*)} F\_s^{T\_{i-1}^\*}(dx), & i \in \{1, \dots, n\} \\[2mm] F\_s^{T\_0^\*}(dx) = F\_s^{T^\*}(dx). \end{cases} \tag{24}$$

Setting $\Lambda^i(s) := \sum\_{j=1}^{i} \lambda(s, T\_j^\*)$, we conclude that for all $i \in \{1, \dots, n\}$

$$W\_t^{T\_i^\*} = W\_t^{T^\*} - \int\_0^t \sqrt{c\_s} \Lambda^i(s) ds \tag{25}$$

and

$$F\_s^{T\_i^\*}(dx) = \exp\left(x\Lambda^i(s)\right) F\_s^{T^\*}(dx). \tag{26}$$

Note that the coefficients $\sqrt{c\_s}\,\Lambda^i(s)$ and $\exp(x\Lambda^i(s))$, which appear in this measure change, are deterministic functions; therefore the measure change is structure preserving, i.e. the driving process remains a time-inhomogeneous Lévy process after the measure change.
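Structure preservation can be made concrete in code: the measure change only shifts the Brownian drift by $\sqrt{c\_s}\,\Lambda^i(s)$ and exponentially tilts the Lévy density as in (26). A minimal sketch under our own naming, with the per-tenor volatilities given as callables:

```python
import numpy as np

def cumulative_volatility(lambdas, s):
    """Lambda^i(s) = sum_{j<=i} lambda(s, T_j*) for a list `lambdas`
    of per-tenor volatility functions lambda(., T_j*)."""
    return sum(lam(s) for lam in lambdas)

def tilted_levy_density(base_density, Lambda_i):
    """Exponentially tilted density F_s^{T_i*}(dx) = e^{x*Lambda^i(s)} F_s^{T*}(dx),
    cf. Eq. (26); the tilt is deterministic, so the Levy property is preserved."""
    return lambda x: np.exp(x * Lambda_i) * base_density(x)
```

The tilt factor is just a deterministic reweighting of the jump sizes, which is exactly why the driving process stays a time-inhomogeneous Lévy process under every forward measure.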

Since the forward process $F(\cdot, T\_i^\*, T\_{i-1}^\*)$ is by construction a $\mathbb{P}\_{T\_{i-1}^\*}$-martingale, the process $\frac{F(\cdot, T\_i^\*, T\_{i-1}^\*)}{F(0, T\_i^\*, T\_{i-1}^\*)}$, which is the density process

$$\left. \frac{d\mathbb{P}\_{T\_i^\*}}{d\mathbb{P}\_{T\_{i-1}^\*}} \right|\_{\mathcal{F}\_t} = \frac{F(t, T\_i^\*, T\_{i-1}^\*)}{F(0, T\_i^\*, T\_{i-1}^\*)} \tag{27}$$

is a $\mathbb{P}\_{T\_{i-1}^\*}$-martingale as well. By iterating the relation (21) we get on $\mathcal{F}\_{T\_{i-1}^\*}$

$$\begin{aligned} \frac{d\mathbb{P}\_{T\_{i-1}^\*}}{d\mathbb{P}\_{T^\*}} &= \frac{B(0, T^\*)}{B(0, T\_{i-1}^\*)} \prod\_{j=1}^{i-1} F(T\_{i-1}^\*, T\_j^\*, T\_{j-1}^\*) \\ &= \exp\left(\sum\_{j=1}^{i-1} \int\_0^{T\_{i-1}^\*} \lambda(s, T\_j^\*) dL\_s^{T\_{j-1}^\*}\right). \end{aligned} \tag{28}$$

Applying Proposition III.3.8 of Jacod and Shiryaev (1987) [13], we see that its restriction to $\mathcal{F}\_t$ for $t \in [0, T\_i^\*]$

$$\left.\frac{d\mathbb{P}\_{T\_i^\*}}{d\mathbb{P}\_{T^\*}}\right|\_{\mathcal{F}\_t} = \frac{B(0, T^\*)}{B(0, T\_i^\*)} \prod\_{j=1}^i F(t, T\_j^\*, T\_{j-1}^\*) \tag{29}$$

is a $\mathbb{P}\_{T^\*}$-martingale.

# **3 Fourier-Based Methods for Option Pricing**

We will derive an explicit valuation formula for standard interest rate derivatives such as caps and floors in the Lévy forward process model. Since floor prices can be derived from the corresponding put–call parity relation, we concentrate on caps. Recall that a cap is a sequence of call options on subsequent LIBOR rates; each single option is called a caplet. The payoff of a caplet with strike rate $K$ and maturity $T\_i^\*$ is

$$\delta\_i^\* \left( L(T\_i^\*, T\_i^\*) - K \right)^+, \tag{30}$$

where the payment is made at time point $T\_{i-1}^\*$. The forward LIBOR rates $L(T\_i^\*, T\_i^\*)$ are the discretely compounded, annualized interest rates which can be earned from an investment during a future interval starting at $T\_i^\*$ and ending at $T\_{i-1}^\*$, considered at the time point $T\_i^\*$. These rates can be expressed in terms of the forward prices as follows

$$L(T\_i^\*, T\_i^\*) = \frac{1}{\delta\_i^\*} \left( F(T\_i^\*, T\_i^\*, T\_{i-1}^\*) - 1 \right). \tag{31}$$
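Relation (31) is a simple affine transformation between the two quotations; a minimal helper pair (our own naming, purely for illustration) makes the direction of the conversion explicit:

```python
def libor_from_forward(F, delta):
    """L = (F - 1)/delta_i*, cf. Eq. (31): LIBOR rate from forward price."""
    return (F - 1.0) / delta

def forward_from_libor(L, delta):
    """Inverse relation: F = 1 + delta_i* L."""
    return 1.0 + delta * L
```

The same transformation maps the strike: a caplet with strike rate $K$ on the LIBOR rate corresponds to a call with transformed strike $1 + \delta\_i^\* K$ on the forward price.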

The time-0 price of the caplet, denoted by $\mathrm{Cplt}\_0(T\_i^\*, K)$, is given by

$$\text{Cplt}\_0(T\_i^\*, K) = B(0, T\_{i-1}^\*) \delta\_i^\* \mathbb{E}\_{\mathbb{P}\_{T\_{i-1}^\*}} \left[ \left( L(T\_i^\*, T\_i^\*) - K \right)^+ \right]. \tag{32}$$

Instead of basing the pricing on the Lévy LIBOR model one can use the Lévy forward process approach (see Eberlein and Özkan (2005) [5]). It is then more natural to write the pricing formula (32) in the form

$$Cplt\_0(T\_i^\*, K) = B(0, T\_{i-1}^\*) \mathbb{E}\_{\mathbb{P}\_{T\_{i-1}^\*}} \left[ \left( F(T\_i^\*, T\_i^\*, T\_{i-1}^\*) - \tilde{K}\_i \right)^+ \right],\tag{33}$$

where $\widetilde{K}\_i := 1 + \delta\_i^\* K$. From (17), the forward process $F(\cdot, T\_i^\*, T\_{i-1}^\*)$ is given by

$$\begin{aligned} F(T\_i^\*, T\_i^\*, T\_{i-1}^\*) &= F(0, T\_i^\*, T\_{i-1}^\*) \exp\left(\int\_0^{T\_i^\*} b\_s^{T\_{i-1}^\*} \lambda(s, T\_i^\*) ds\right) \\ &\quad \times \exp\left(\int\_0^{T\_i^\*} \sqrt{c\_s}\, \lambda(s, T\_i^\*) dW\_s^{T\_{i-1}^\*}\right) \\ &\quad \times \exp\left(\int\_0^{T\_i^\*} \int\_{\mathbb{R}} x \lambda(s, T\_i^\*)\, \widetilde{\mu}^{L^{T\_{i-1}^\*}}(ds, dx)\right). \end{aligned} \tag{34}$$

Using the relations (25) and (26) we obtain for $t \in [0, T\_i^\*]$

$$F(t, T\_i^\*, T\_{i-1}^\*) = F(0, T\_i^\*, T\_{i-1}^\*) \exp\left(\int\_0^t \lambda(s, T\_i^\*) dL\_s^{T^\*} + d(t, T\_i^\*)\right), \tag{35}$$


where

$$\begin{aligned} d(t, T\_i^\*) &= \int\_0^t \lambda(s, T\_i^\*) \left[ b\_s^{T\_{i-1}^\*} - b\_s^{T^\*} - \Lambda^{i-1}(s) c\_s \right] ds \\ &\quad - \int\_0^t \lambda(s, T\_i^\*) \int\_{\mathbb{R}} x \left( e^{x\Lambda^{i-1}(s)} - 1 \right) F\_s^{T^\*}(dx)\, ds. \end{aligned} \tag{36}$$

Remember that on $\mathcal{F}\_{T\_{i-1}^\*}$

$$\frac{d\mathbb{P}\_{T\_{i-1}^\*}}{d\mathbb{P}\_{T^\*}} = \exp\left(\sum\_{j=1}^{i-1} \int\_0^{T\_{i-1}^\*} \lambda(s, T\_j^\*) dL\_s^{T^\*} + \sum\_{j=1}^{i-1} d(T\_{i-1}^\*, T\_j^\*)\right). \tag{37}$$

Keeping in mind Assumption 2.2 (LR.1), we find

$$\exp\left(-\sum\_{j=1}^{i-1} d(T\_{i-1}^\*, T\_j^\*)\right) = \mathbb{E}\_{\mathbb{P}\_{T^\*}} \left[ \exp\left(\int\_0^{T\_{i-1}^\*} \Lambda^{i-1}(s) dL\_s^{T^\*} \right) \right]. \tag{38}$$

Using Proposition 8 in Eberlein and Kluge (2006) [6], we find

$$\exp\left(-\sum\_{j=1}^{i-1} d(T\_{i-1}^\*, T\_j^\*)\right) = \exp\left(\int\_0^{T\_{i-1}^\*} \theta\_s\left(\Lambda^{i-1}(s)\right)ds\right). \tag{39}$$

Consequently,

$$\frac{d\mathbb{P}\_{T\_{i-1}^\*}}{d\mathbb{P}\_{T^\*}} = \exp\left(\int\_0^{T\_{i-1}^\*} \Lambda^{i-1}(s) dL\_s^{T^\*} - \int\_0^{T\_{i-1}^\*} \theta\_s\left(\Lambda^{i-1}(s)\right) ds\right). \tag{40}$$
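In the purely continuous case, $\theta\_s(z) = z b\_s + \tfrac{z^2}{2} c\_s$, and (40) reduces to the Doléans-Dade exponential $\exp\big(\int\_0^{T\_{i-1}^\*} \Lambda^{i-1}(s)\sqrt{c\_s}\, dW\_s^{T^\*} - \tfrac{1}{2}\int\_0^{T\_{i-1}^\*} (\Lambda^{i-1}(s))^2 c\_s\, ds\big)$. A quick Monte Carlo sanity check (our own construction, with a constant coefficient $\Lambda^{i-1}\sqrt{c} = \tilde\sigma$) confirms that this density has unit expectation:

```python
import numpy as np

def density_mean(sig=0.3, T=1.0, n_paths=200_000, seed=0):
    """Monte Carlo estimate of E[exp(sig*W_T - sig^2*T/2)], the
    Brownian-case density of Eq. (40) with constant
    Lambda^{i-1}(s)*sqrt(c_s) = sig; the exact value is 1."""
    rng = np.random.default_rng(seed)
    W_T = rng.standard_normal(n_paths) * np.sqrt(T)
    return np.exp(sig * W_T - 0.5 * sig**2 * T).mean()
```

With 200,000 draws the estimate sits within a fraction of a percent of 1, which is a cheap way to validate a simulation of the measure change before adding jumps.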

Knowing that the process $\frac{F(\cdot, T\_i^\*, T\_{i-1}^\*)}{F(0, T\_i^\*, T\_{i-1}^\*)}$ is a $\mathbb{P}\_{T\_{i-1}^\*}$-martingale, we obtain

$$\exp(-d(T\_i^\*, T\_i^\*)) = \mathbb{E}\_{\mathbb{P}\_{T\_{i-1}^\*}} \left[ \exp\left(\int\_0^{T\_i^\*} \lambda(s, T\_i^\*) dL\_s^{T^\*}\right) \right]. \tag{41}$$

Hence,

$$\begin{split} & \exp(-d(T\_i^\*, T\_i^\*)) \\ &= \exp\left(-\int\_0^{T\_i^\*} \theta\_s\left(\Lambda^{i-1}(s)\right)ds\right) \mathbb{E}\_{\mathbb{P}\_{T^\*}}\left[\exp\left(\int\_0^{T\_i^\*} \Lambda^i(s)dL\_s^{T^\*}\right)\right] \\ &= \exp\left(\int\_0^{T\_i^\*} \left[\theta\_s\left(\Lambda^i(s)\right) - \theta\_s\left(\Lambda^{i-1}(s)\right)\right]ds\right). \end{split} \tag{42}$$

Thus,

$$d(T\_i^\*, T\_i^\*) = \int\_0^{T\_i^\*} \left[ -\theta\_s \left( \Lambda^i(s) \right) + \theta\_s \left( \Lambda^{i-1}(s) \right) \right] ds. \tag{43}$$
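The drift correction (43) only requires the cumulant (9) evaluated along $\Lambda^i$ and $\Lambda^{i-1}$. A numerical sketch (our own naming; time-constant characteristics and volatilities for simplicity, jump integral truncated to a finite grid):

```python
import numpy as np

def cumulant(z, b, c, levy_density, x_min=-5.0, x_max=5.0, n=100001):
    """theta(z) = z*b + (z^2/2)*c + int (e^{zx} - 1 - zx) F(dx), Eq. (9)."""
    x = np.linspace(x_min, x_max, n)
    x = x[x != 0.0]                      # the integrand is O(x^2) near zero
    dx = (x_max - x_min) / (n - 1)
    jump = np.sum((np.exp(z * x) - 1.0 - z * x) * levy_density(x)) * dx
    return z * b + 0.5 * z * z * c + jump

def drift_correction(Lambda_im1, lam_i, b, c, levy_density, T):
    """d(T_i*, T_i*) = int_0^{T_i*} [theta(Lambda^{i-1}) - theta(Lambda^i)] ds,
    Eq. (43), with Lambda^i = Lambda^{i-1} + lambda_i; constants in s, so the
    time integral is a product."""
    return T * (cumulant(Lambda_im1, b, c, levy_density)
                - cumulant(Lambda_im1 + lam_i, b, c, levy_density))
```

In the pure-diffusion case with $b = 0$ and $\Lambda^{i-1} = 0$ this reduces to $d = -\tfrac{1}{2}\lambda\_i^2 c\, T\_i^\*$, the familiar lognormal convexity correction.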

Define the random variable $X\_{T\_i^\*}$ as the logarithm of $F(T\_i^\*, T\_i^\*, T\_{i-1}^\*)$. Therefore,

$$X\_{T\_i^\*} = \ln\left(F(0, T\_i^\*, T\_{i-1}^\*)\right) + \int\_0^{T\_i^\*} \lambda(s, T\_i^\*) dL\_s^{T^\*} + d(T\_i^\*, T\_i^\*).\tag{44}$$

**Proposition 3.1** *Suppose there is a real number $R \in (1, 1+\varepsilon)$ such that the moment-generating function of $X\_{T\_i^\*}$ with respect to $\mathbb{P}\_{T\_{i-1}^\*}$ is finite at $R$, i.e. $M\_{X\_{T\_i^\*}}(R) < \infty$. Then*

$$\begin{aligned} \mathrm{Cplt}\_0(T\_i^\*, K) &= \frac{\widetilde{K}\_i B(0, T\_{i-1}^\*)}{2\pi} \int\_{\mathbb{R}} \left( \frac{F(0, T\_i^\*, T\_{i-1}^\*)}{\widetilde{K}\_i} \right)^{R+\mathrm{i}u} \\ &\quad \times \exp\left( \int\_0^{T\_i^\*} \int\_{\mathbb{R}} e^{x\Lambda^{i-1}(s)} \left[ \left( e^{(R+\mathrm{i}u)x\lambda(s, T\_i^\*)} - 1 \right) - (R+\mathrm{i}u)\left( e^{x\lambda(s, T\_i^\*)} - 1 \right) \right] F\_s^{T^\*}(dx)\, ds \right) \\ &\quad \times \exp\left( \int\_0^{T\_i^\*} \frac{c\_s}{2} (R+\mathrm{i}u)(R+\mathrm{i}u-1)\lambda^2(s, T\_i^\*)\, ds \right) \frac{du}{(R+\mathrm{i}u)(R+\mathrm{i}u-1)}. \end{aligned} \tag{45}$$

*Proof* The time-0 price of the caplet with strike rate $K$ and maturity $T\_i^\*$ has the form

$$\begin{aligned} \mathrm{Cplt}\_0(T\_i^\*, K) &= B(0, T\_{i-1}^\*)\, \mathbb{E}\_{\mathbb{P}\_{T\_{i-1}^\*}} \left[ \left( e^{X\_{T\_i^\*}} - \widetilde{K}\_i \right)^+ \right] \\ &= B(0, T\_{i-1}^\*)\, \mathbb{E}\_{\mathbb{P}\_{T\_{i-1}^\*}} \left[ f\left( X\_{T\_i^\*} \right) \right], \end{aligned} \tag{46}$$

where the function $f : \mathbb{R} \to \mathbb{R}\_+$ is defined by $f(x) = (e^x - \widetilde{K}\_i)^+$.

Applying Theorem 2.2 in Eberlein et al. (2010) [8] (by the definition of $X\_{T\_i^\*}$ we have $s = 0$ here), we get

$$\mathrm{Cplt}\_0(T\_i^\*, K) = \frac{B(0, T\_{i-1}^\*)}{2\pi} \int\_{\mathbb{R}} M\_{X\_{T\_i^\*}}(R + \mathrm{i}u)\, \hat{f}(-u + \mathrm{i}R)\, du, \tag{47}$$

where the Fourier transform $\hat{f}$ is given by

$$\hat{f}(-u + \mathrm{i}R) = \frac{\widetilde{K}\_i^{1-R-\mathrm{i}u}}{(R+\mathrm{i}u)(R+\mathrm{i}u-1)} \tag{48}$$

and the moment-generating function $M\_{X\_{T\_i^\*}}$ is given by

$$\begin{aligned} M\_{X\_{T\_i^\*}}(R+\mathrm{i}u) &= \mathbb{E}\_{\mathbb{P}\_{T\_{i-1}^\*}} \left[ \exp\left( (R+\mathrm{i}u) X\_{T\_i^\*} \right) \right] \\ &= \left( F(0, T\_i^\*, T\_{i-1}^\*) \right)^{R+\mathrm{i}u} \exp\left( (R+\mathrm{i}u)\, d(T\_i^\*, T\_i^\*) \right) \\ &\quad \times \mathbb{E}\_{\mathbb{P}\_{T\_{i-1}^\*}} \left[ \exp\left( \int\_0^{T\_i^\*} (R+\mathrm{i}u) \lambda(s, T\_i^\*)\, dL\_s^{T^\*} \right) \right]. \end{aligned} \tag{49}$$

Making a change of measure, we find

$$\begin{aligned} M\_{X\_{T\_i^\*}}(R+\mathrm{i}u) &= \left(F(0, T\_i^\*, T\_{i-1}^\*)\right)^{R+\mathrm{i}u} \exp\left((R+\mathrm{i}u)\, d(T\_i^\*, T\_i^\*)\right) \\ &\quad \times \frac{\mathbb{E}\_{\mathbb{P}\_{T^\*}}\left[\exp\left(\int\_0^{T\_i^\*} \left((R+\mathrm{i}u)\lambda(s, T\_i^\*) + \Lambda^{i-1}(s)\right) dL\_s^{T^\*}\right)\right]}{\mathbb{E}\_{\mathbb{P}\_{T^\*}}\left[\exp\left(\int\_0^{T\_i^\*} \Lambda^{i-1}(s)\, dL\_s^{T^\*}\right)\right]}. \end{aligned} \tag{50}$$

Using Proposition 8 in Eberlein and Kluge (2006) [6], we can easily show that

$$\begin{aligned} M\_{X\_{T\_i^\*}}(R+\mathrm{i}u) &= \left(F(0, T\_i^\*, T\_{i-1}^\*)\right)^{R+\mathrm{i}u} \\ &\quad \times \exp\left((R+\mathrm{i}u)\int\_0^{T\_i^\*} \left[-\theta\_s\left(\Lambda^i(s)\right) + \theta\_s\left(\Lambda^{i-1}(s)\right)\right] ds\right) \\ &\quad \times \frac{\exp\left(\int\_0^{T\_i^\*} \theta\_s\left((R+\mathrm{i}u)\lambda(s, T\_i^\*) + \Lambda^{i-1}(s)\right) ds\right)}{\exp\left(\int\_0^{T\_i^\*} \theta\_s\left(\Lambda^{i-1}(s)\right) ds\right)} \\ &= \left(F(0, T\_i^\*, T\_{i-1}^\*)\right)^{R+\mathrm{i}u} \exp\left(\int\_0^{T\_i^\*} \theta\_s\left((R+\mathrm{i}u)\lambda(s, T\_i^\*) + \Lambda^{i-1}(s)\right) ds\right) \\ &\quad \times \exp\left(\int\_0^{T\_i^\*} \left[(-R-\mathrm{i}u)\,\theta\_s\left(\Lambda^i(s)\right) - (1-R-\mathrm{i}u)\,\theta\_s\left(\Lambda^{i-1}(s)\right)\right] ds\right). \end{aligned} \tag{51}$$

Taking into account the choice of the drift coefficient in (19), the cumulant function $\theta\_s$ (see (9)) and the moment-generating function $M\_{X\_{T\_i^\*}}$, respectively, become

$$\begin{aligned} \theta\_s(R+\mathrm{i}u) &= (R+\mathrm{i}u)\int\_{\mathbb{R}} \left(\frac{e^{x(R+\mathrm{i}u)} - 1}{R+\mathrm{i}u} - \frac{e^{x\lambda(s, T\_1^\*)} - 1}{\lambda(s, T\_1^\*)}\right) F\_s^{T^\*}(dx) \\ &\quad + \frac{c\_s}{2}(R+\mathrm{i}u)\left(R+\mathrm{i}u - \lambda(s, T\_1^\*)\right) \end{aligned} \tag{52}$$

and

$$\begin{aligned} M\_{X\_{T\_i^\*}}(R+\mathrm{i}u) &= \left(F(0, T\_i^\*, T\_{i-1}^\*)\right)^{R+\mathrm{i}u} \exp\left(\int\_0^{T\_i^\*} \frac{c\_s}{2}(R+\mathrm{i}u)(R+\mathrm{i}u-1)\lambda^2(s, T\_i^\*)\, ds\right) \\ &\quad \times \exp\left(\int\_0^{T\_i^\*} \int\_{\mathbb{R}} e^{x\Lambda^{i-1}(s)} \left(e^{(R+\mathrm{i}u)x\lambda(s, T\_i^\*)} - 1\right) F\_s^{T^\*}(dx)\, ds\right) \\ &\quad \times \exp\left(-(R+\mathrm{i}u)\int\_0^{T\_i^\*} \int\_{\mathbb{R}} e^{x\Lambda^{i-1}(s)} \left(e^{x\lambda(s, T\_i^\*)} - 1\right) F\_s^{T^\*}(dx)\, ds\right). \end{aligned} \tag{53}$$

298 E. Eberlein et al.

Finally, from (48) and (53) we conclude that

$$\begin{aligned} \mathrm{Cplt}\_0(T\_i^\*, K) &= \frac{\widetilde{K}\_i B(0, T\_{i-1}^\*)}{2\pi} \int\_{\mathbb{R}} \left( \frac{F(0, T\_i^\*, T\_{i-1}^\*)}{\widetilde{K}\_i} \right)^{R+\mathrm{i}u} \\ &\quad \times \exp\left( \int\_0^{T\_i^\*} \int\_{\mathbb{R}} e^{x\Lambda^{i-1}(s)} \left[ \left( e^{(R+\mathrm{i}u)x\lambda(s, T\_i^\*)} - 1 \right) - (R+\mathrm{i}u)\left( e^{x\lambda(s, T\_i^\*)} - 1 \right) \right] F\_s^{T^\*}(dx)\, ds \right) \\ &\quad \times \exp\left( \int\_0^{T\_i^\*} \frac{c\_s}{2} (R+\mathrm{i}u)(R+\mathrm{i}u-1)\lambda^2(s, T\_i^\*)\, ds \right) \frac{du}{(R+\mathrm{i}u)(R+\mathrm{i}u-1)}. \end{aligned} \tag{54}$$
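As a sanity check of (45)/(54), consider the purely continuous special case $F^{T^\*} \equiv 0$ with constant $c\_s = c$ and $\lambda(s, T\_i^\*) = \lambda$: the integral must then reproduce the Black caplet price with total volatility $v = \lambda\sqrt{c\, T\_i^\*}$. The Python sketch below (all names ours; $R = 1.5$ is an illustrative choice, assuming the required moment exists) discretizes the $u$-integral on a uniform grid; the Gaussian decay of the integrand makes a modest truncation sufficient:

```python
import numpy as np
from math import log, sqrt, erf, pi

def black_call(F0, K, v):
    """Undiscounted Black price E[(F_T - K)^+] with total volatility v."""
    N = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
    d1 = (log(F0 / K) + 0.5 * v * v) / v
    return F0 * N(d1) - K * N(d1 - v)

def caplet_fourier(B0, F0, K_til, v, R=1.5, u_max=80.0, n=40001):
    """Eq. (45) specialized to zero jumps: price = K_til*B0/(2*pi) *
    Re int (F0/K_til)^{R+iu} exp(v^2/2 (R+iu)(R+iu-1)) / ((R+iu)(R+iu-1)) du."""
    u = np.linspace(-u_max, u_max, n)
    z = R + 1j * u
    integrand = (F0 / K_til) ** z * np.exp(0.5 * v * v * z * (z - 1.0)) / (z * (z - 1.0))
    du = u[1] - u[0]
    return K_til * B0 / (2.0 * pi) * np.real(np.sum(integrand) * du)
```

Agreement with the closed-form Black price to many decimal places is a useful regression test before switching on the jump part of the model.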

# **4 Sensitivity Analysis**

# *4.1 Greeks Computed by the Malliavin Approach*

In this section we present an application of the Malliavin calculus to the computation of Greeks within the Lévy forward process model. We refer to the literature, for example Di Nunno et al. (2008) [2] as well as Nualart (2006) [15], for details on the theoretical aspects of Malliavin calculus. Another important reference is Yablonski (2008) [19]. See also the Appendix for a short presentation of definitions and results used in the sequel. The forward process $F(t, T\_i^\*, T\_{i-1}^\*)$ under the forward measure $\mathbb{P}\_{T\_{i-1}^\*}$ can be written as a stochastic exponential

$$F(t, T\_i^\*, T\_{i-1}^\*) = F(0, T\_i^\*, T\_{i-1}^\*)\, \mathcal{E}\_t\left( Z(\cdot, T\_i^\*) \right) \tag{55}$$

with

$$Z(t, T\_i^\*) = \int\_0^t \sqrt{c\_s}\, \lambda(s, T\_i^\*) dW\_s^{T\_{i-1}^\*} + \int\_0^t \int\_{\mathbb{R}} (e^{x\lambda(s, T\_i^\*)} - 1)\, \widetilde{\mu}^{L^{T\_{i-1}^\*}}(ds, dx). \tag{56}$$

Expressed in differential form we get the $\mathbb{P}\_{T\_{i-1}^\*}$-dynamics

$$\frac{dF(t, T\_i^\*, T\_{i-1}^\*)}{F(t-, T\_i^\*, T\_{i-1}^\*)} = \sqrt{c\_t}\, \lambda(t, T\_i^\*) dW\_t^{T\_{i-1}^\*} + \int\_{\mathbb{R}} (e^{x\lambda(t, T\_i^\*)} - 1)\, \widetilde{\mu}^{L^{T\_{i-1}^\*}}(dt, dx), \tag{57}$$

where $F(t-, T\_i^\*, T\_{i-1}^\*)$ is the pathwise left limit of $F(\cdot, T\_i^\*, T\_{i-1}^\*)$ at the point $t$.

As in the classical Malliavin calculus we are able to associate with the solution of (57) the process $Y(t, T\_i^\*, T\_{i-1}^\*) := \frac{\partial F(t, T\_i^\*, T\_{i-1}^\*)}{\partial F(0, T\_i^\*, T\_{i-1}^\*)}$, called the first variation process of $F(t, T\_i^\*, T\_{i-1}^\*)$. The following proposition provides a simpler expression for the Malliavin derivative operator $D\_{r,0}$ when applied to the forward process $F(t, T\_i^\*, T\_{i-1}^\*)$ (see Di Nunno et al. (2008) [2], Theorem 17.4 and Yablonski (2008) [19], Definition 17 for details). We denote the domain of the operator $D\_{r,0}$ in $L^2(\Omega)$ by $\mathbb{D}\_{1,2}$, meaning that $\mathbb{D}\_{1,2}$ is the closure of the class of smooth random variables $\mathcal{S}$ (see (100) in the Appendix).

**Proposition 4.1** *Let $(F(t, T\_i^\*, T\_{i-1}^\*))\_{t\in[0, T^\*]}$ be the solution of (57). Then $F(t, T\_i^\*, T\_{i-1}^\*) \in \mathbb{D}\_{1,2}$ and the Malliavin derivative is given by*

$$\begin{split} &D\_{r,0}F(t, T\_i^\*, T\_{i-1}^\*) \\ &= Y(t, T\_i^\*, T\_{i-1}^\*)Y(r-, T\_i^\*, T\_{i-1}^\*)^{-1} F(r-, T\_i^\*, T\_{i-1}^\*) \lambda(r, T\_i^\*) \sqrt{c\_r} \mathbf{1}\_{\{r \le t\}}. \end{split} \tag{58}$$

#### **4.1.1 Variation in the Initial Forward Price**

In this section, we provide an expression for the *Delta*, the partial derivative of the caplet price $\mathrm{Cplt}\_0(T\_i^\*, K)$ with respect to the initial condition $F(0, T\_i^\*, T\_{i-1}^\*)$, which is given by

$$\Delta(F(0, T\_i^\*, T\_{i-1}^\*)) = \frac{\partial Cplt\_0(T\_i^\*, K)}{\partial F(0, T\_i^\*, T\_{i-1}^\*)}.\tag{59}$$

The derivative with respect to the initial LIBOR rate is then an easy consequence.

$$\begin{split} \Delta(L(0, T\_i^\*)) &= \frac{\partial Cplt\_0(T\_i^\*, K)}{\partial L(0, T\_i^\*)} \\ &= \Delta(F(0, T\_i^\*, T\_{i-1}^\*)) \frac{\partial F(0, T\_i^\*, T\_{i-1}^\*)}{\partial L(0, T\_i^\*)} \\ &= \delta\_i^\* \Delta(F(0, T\_i^\*, T\_{i-1}^\*)), \end{split} \tag{60}$$

since

$$L(0, T\_i^\*) = \frac{1}{\delta\_i^\*} \left( F(0, T\_i^\*, T\_{i-1}^\*) - 1 \right). \tag{61}$$

Let us define the set

$$\tilde{T}\_i = \left\{ h\_i \in L^2([0, T\_i^\*]) : \int\_0^{T\_i^\*} h\_i(u) du = 1 \right\}.\tag{62}$$

**Proposition 4.2** *For all functions $h\_i \in \widetilde{T}\_i$, we have*

$$\begin{aligned} \Delta(F(0, T\_i^\*, T\_{i-1}^\*)) &= \frac{B(0, T\_{i-1}^\*)}{F(0, T\_i^\*, T\_{i-1}^\*)} \mathbb{E}\_{\mathbb{P}\_{T^\*}} \Bigg[ \left( F(T\_i^\*, T\_i^\*, T\_{i-1}^\*) - \widetilde{K}\_i \right)^+ \\ &\quad \times \exp\left( \int\_0^{T\_i^\*} \Lambda^{i-1}(s)\, dL\_s^{T^\*} - \int\_0^{T\_i^\*} \theta\_s\left( \Lambda^{i-1}(s) \right) ds \right) \\ &\quad \times \left( \int\_0^{T\_i^\*} \frac{h\_i(u)\, dW\_u^{T^\*}}{\lambda(u, T\_i^\*) \sqrt{c\_u}} - \int\_0^{T\_i^\*} \frac{h\_i(u) \Lambda^{i-1}(u)}{\lambda(u, T\_i^\*)}\, du \right) \Bigg]. \end{aligned} \tag{63}$$

*Proof* We consider a more general payoff of the form $H(F(T\_i^\*, T\_i^\*, T\_{i-1}^\*))$ such that $H : \mathbb{R} \longrightarrow \mathbb{R}$ is a locally integrable function satisfying

$$\mathbb{E}\_{\mathbb{P}\_{T\_{i-1}^\*}} \left[ H(F(T\_i^\*, T\_i^\*, T\_{i-1}^\*))^2 \right] < \infty. \tag{64}$$

First, assume that *H* is a continuously differentiable function with compact support. Then we can differentiate inside the expectation and get

$$\begin{aligned} \Delta\_H(F(0, T\_i^\*, T\_{i-1}^\*)) &:= \frac{\partial\, \mathbb{E}\_{\mathbb{P}\_{T\_{i-1}^\*}}[H(F(T\_i^\*, T\_i^\*, T\_{i-1}^\*))]}{\partial F(0, T\_i^\*, T\_{i-1}^\*)} \\ &= \mathbb{E}\_{\mathbb{P}\_{T\_{i-1}^\*}} \left[ H'(F(T\_i^\*, T\_i^\*, T\_{i-1}^\*)) \frac{\partial F(T\_i^\*, T\_i^\*, T\_{i-1}^\*)}{\partial F(0, T\_i^\*, T\_{i-1}^\*)} \right] \\ &= \mathbb{E}\_{\mathbb{P}\_{T\_{i-1}^\*}} \left[ H'(F(T\_i^\*, T\_i^\*, T\_{i-1}^\*))\, Y(T\_i^\*, T\_i^\*, T\_{i-1}^\*) \right]. \end{aligned} \tag{65}$$

Using Proposition 4.1 we find for any $h\_i \in \widetilde{T}\_i$

$$Y(T\_i^\*, T\_i^\*, T\_{i-1}^\*) = \int\_0^{T\_i^\*} D\_{u,0} F(T\_i^\*, T\_i^\*, T\_{i-1}^\*) \frac{h\_i(u)Y(u-, T\_i^\*, T\_{i-1}^\*) du}{F(u-, T\_i^\*, T\_{i-1}^\*) \lambda(u, T\_i^\*) \sqrt{c\_u}}.\tag{66}$$

From the chain rule (see Yablonski (2008) [19], Proposition 4.8) we find

$$\begin{aligned} \Delta\_H(F(0, T\_i^\*, T\_{i-1}^\*)) &= \mathbb{E}\_{\mathbb{P}\_{T\_{i-1}^\*}} \Big[ \int\_0^{T\_i^\*} H'(F(T\_i^\*, T\_i^\*, T\_{i-1}^\*))\, D\_{u,0} F(T\_i^\*, T\_i^\*, T\_{i-1}^\*) \times \frac{h\_i(u) Y(u-, T\_i^\*, T\_{i-1}^\*)\, du}{F(u-, T\_i^\*, T\_{i-1}^\*) \lambda(u, T\_i^\*) \sqrt{c\_u}} \Big] \\ &= \mathbb{E}\_{\mathbb{P}\_{T\_{i-1}^\*}} \Big[ \int\_0^{T\_i^\*} D\_{u,0} H(F(T\_i^\*, T\_i^\*, T\_{i-1}^\*)) \times \frac{h\_i(u) Y(u-, T\_i^\*, T\_{i-1}^\*)\, du}{F(u-, T\_i^\*, T\_{i-1}^\*) \lambda(u, T\_i^\*) \sqrt{c\_u}} \Big] \\ &= \mathbb{E}\_{\mathbb{P}\_{T\_{i-1}^\*}} \Big[ \int\_0^{T\_i^\*} \int\_{\mathbb{R}} D\_{u,x} H(F(T\_i^\*, T\_i^\*, T\_{i-1}^\*)) \times \frac{h\_i(u) Y(u-, T\_i^\*, T\_{i-1}^\*)\, du\, \delta\_0(dx)}{F(u-, T\_i^\*, T\_{i-1}^\*) \lambda(u, T\_i^\*) \sqrt{c\_u}} \Big], \end{aligned} \tag{67}$$

where $\delta\_0(dx)$ is the Dirac measure at 0.

By the definition of the Skorohod integral $\delta(\cdot)$ (see Yablonski (2008) [19], Sect. 5), we reach

$$\begin{aligned} \Delta\_H(F(0, T\_i^\*, T\_{i-1}^\*)) \\ = \mathbb{E}\_{\mathbb{P}\_{T\_{i-1}^\*}} \left[ H(F(T\_i^\*, T\_i^\*, T\_{i-1}^\*)) \delta \left( \frac{h\_i(\cdot) Y(\cdot -, T\_i^\*, T\_{i-1}^\*) \delta\_0(\cdot)}{F(\cdot -, T\_i^\*, T\_{i-1}^\*) \lambda(\cdot, T\_i^\*) \sqrt{c\_\cdot}} \right) \right] . \end{aligned} (68)$$

However, the stochastic process

$$\left(\frac{h\_i(u)Y(u-,T\_i^\*,T\_{i-1}^\*)}{F(u-,T\_i^\*,T\_{i-1}^\*)\lambda(u,T\_i^\*)\sqrt{c\_u}}\right)\_{0\le u\le T\_i^\*}\tag{69}$$

is a predictable process, thus the Skorohod integral coincides with the Itô stochastic integral and we get

$$\begin{aligned} \Delta\_H(F(0, T\_i^\*, T\_{i-1}^\*)) \\ = \mathbb{E}\_{\mathbb{P}\_{T\_{i-1}^\*}} \left[ H(F(T\_i^\*, T\_i^\*, T\_{i-1}^\*)) \int\_0^{T\_i^\*} \frac{h\_i(u)Y(u-, T\_i^\*, T\_{i-1}^\*)dW\_u^{T\_{i-1}^\*}}{F(u-, T\_i^\*, T\_{i-1}^\*) \lambda(u, T\_i^\*) \sqrt{c\_u}} \right]. \end{aligned} \tag{70}$$

By Lemma 12.28, p. 208, in Di Nunno et al. (2008) [2], the result (70) holds for any locally integrable function $H$ such that

$$\mathbb{E}\_{\mathbb{P}\_{T\_{i-1}^\*}} \left[ H(F(T\_i^\*, T\_i^\*, T\_{i-1}^\*))^2 \right] < \infty. \tag{71}$$

In particular, if one takes

$$H(F(T\_i^\*, T\_i^\*, T\_{i-1}^\*)) = B(0, T\_{i-1}^\*) \left( F(T\_i^\*, T\_i^\*, T\_{i-1}^\*) - \tilde{K}\_i \right)^+,\tag{72}$$

we can express the derivative of the caplet price $\mathrm{Cplt}\_0(T\_i^\*, K)$ with respect to the initial condition $F(0, T\_i^\*, T\_{i-1}^\*)$ in the form of a weighted expectation as follows

$$\begin{split} \Delta(F(0, T\_i^\*, T\_{i-1}^\*)) &= B(0, T\_{i-1}^\*) \mathbb{E}\_{\mathbb{P}\_{T\_{i-1}^\*}^\*} \Bigg[ \left( F(T\_i^\*, T\_i^\*, T\_{i-1}^\*) - \widetilde{K}\_i \right)^+ \\ &\times \int\_0^{T\_i^\*} \frac{h\_i(u) Y(u-, T\_i^\*, T\_{i-1}^\*) dW\_u^{T\_{i-1}^\*}}{\lambda(u, T\_i^\*) \sqrt{c\_u} F(u-, T\_i^\*, T\_{i-1}^\*)} \Bigg]. \end{split} \tag{73}$$

One easily shows that

$$Y(u-, T\_i^\*, T\_{i-1}^\*) = \frac{F(u-, T\_i^\*, T\_{i-1}^\*)}{F(0, T\_i^\*, T\_{i-1}^\*)}, \tag{74}$$

hence

$$\begin{aligned} \Delta(F(0, T\_i^\*, T\_{i-1}^\*)) \\ = \frac{B(0, T\_{i-1}^\*)}{F(0, T\_i^\*, T\_{i-1}^\*)} \mathbb{E}\_{\mathbb{P}\_{T\_{i-1}^\*}} \left[ \left( F(T\_i^\*, T\_i^\*, T\_{i-1}^\*) - \widetilde{K}\_i \right)^+ \int\_0^{T\_i^\*} \frac{h\_i(u) dW\_u^{T\_{i-1}^\*}}{\lambda(u, T\_i^\*) \sqrt{c\_u}} \right] . \end{aligned} (75)$$

In accordance with (25) we can write

$$W\_t^{T\_{i-1}^\*} = W\_t^{T^\*} - \int\_0^t \Lambda^{i-1}(s)\sqrt{c\_s}\, ds. \tag{76}$$

By making a measure change using the fact (see (40)) that

$$\left.\frac{d\mathbb{P}\_{T\_{i-1}^\*}}{d\mathbb{P}\_{T^\*}}\right|\_{\mathcal{F}\_{T\_i^\*}} = \exp\left(\int\_0^{T\_i^\*} \Lambda^{i-1}(s)\, dL\_s^{T^\*} - \int\_0^{T\_i^\*} \theta\_s\left(\Lambda^{i-1}(s)\right) ds\right), \tag{77}$$

we end up with

$$\begin{aligned} \Delta(F(0, T\_i^\*, T\_{i-1}^\*)) &= \frac{B(0, T\_{i-1}^\*)}{F(0, T\_i^\*, T\_{i-1}^\*)} \mathbb{E}\_{\mathbb{P}\_{T^\*}} \Bigg[ \left( F(T\_i^\*, T\_i^\*, T\_{i-1}^\*) - \widetilde{K}\_i \right)^+ \\ &\quad \times \exp\left( \int\_0^{T\_i^\*} \Lambda^{i-1}(s)\, dL\_s^{T^\*} - \int\_0^{T\_i^\*} \theta\_s\left( \Lambda^{i-1}(s) \right) ds \right) \\ &\quad \times \left( \int\_0^{T\_i^\*} \frac{h\_i(u)\, dW\_u^{T^\*}}{\lambda(u, T\_i^\*) \sqrt{c\_u}} - \int\_0^{T\_i^\*} \frac{h\_i(u) \Lambda^{i-1}(u)}{\lambda(u, T\_i^\*)}\, du \right) \Bigg]. \end{aligned} \tag{78}$$
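Formula (75), of which (78) is the $\mathbb{P}\_{T^\*}$ version, can be estimated directly by Monte Carlo. In the continuous case with constant $\sigma = \lambda\sqrt{c}$ and the simplest choice $h\_i \equiv 1/T\_i^\*$, the Malliavin weight reduces to $W\_{T\_i^\*}/(\sigma T\_i^\*)$, and the estimator can be checked against the analytic Black delta $B(0, T\_{i-1}^\*) N(d\_1)$. A sketch under exactly these assumptions (all names ours):

```python
import numpy as np
from math import log, sqrt, erf

def mc_malliavin_delta(B0, F0, K_til, sig, T, n_paths=400_000, seed=1):
    """Monte Carlo version of Eq. (75) with h_i = 1/T and constant
    sigma = lambda*sqrt(c): Delta = B0/F0 * E[(F_T - K_til)^+ * W_T/(sig*T)]."""
    rng = np.random.default_rng(seed)
    W_T = rng.standard_normal(n_paths) * sqrt(T)
    F_T = F0 * np.exp(sig * W_T - 0.5 * sig**2 * T)   # forward-measure martingale
    weight = W_T / (sig * T)                          # Malliavin weight
    return B0 / F0 * np.mean(np.maximum(F_T - K_til, 0.0) * weight)

def black_delta(B0, F0, K_til, sig, T):
    """Analytic benchmark in the lognormal case: d Cplt_0 / d F0 = B0 * N(d1)."""
    d1 = (log(F0 / K_til) + 0.5 * sig**2 * T) / (sig * sqrt(T))
    return B0 * 0.5 * (1.0 + erf(d1 / sqrt(2.0)))
```

The appeal of the Malliavin representation is visible even here: the weight does not involve differentiating the discontinuous payoff, so the same estimator applies to digital-type payoffs without modification.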

# *4.2 Greeks Computed by the Fourier-Based Valuation Method*

Thanks to the Fourier-based valuation formula obtained in (45) and the structure of the forward process model as an exponential semimartingale, we can readily calculate the Greeks. We focus on the variation with respect to the initial condition, i.e. the Delta.

**Proposition 4.3** *Suppose there is a real number $R \in (1, 1+\varepsilon)$ such that the moment-generating function of $X\_{T\_i^\*}$ with respect to $\mathbb{P}\_{T\_{i-1}^\*}$ is finite at $R$, i.e. $M\_{X\_{T\_i^\*}}(R) < \infty$. Then*


$$\begin{aligned} \Delta(F(0, T\_i^\*, T\_{i-1}^\*)) &= \frac{B(0, T\_{i-1}^\*)}{2\pi} \int\_{\mathbb{R}} \left( \frac{F(0, T\_i^\*, T\_{i-1}^\*)}{\widetilde{K}\_i} \right)^{R+\mathrm{i}u-1} \\ &\quad \times \exp\left( \int\_0^{T\_i^\*} \int\_{\mathbb{R}} e^{x\Lambda^{i-1}(s)} \left( e^{(R+\mathrm{i}u)x\lambda(s, T\_i^\*)} - 1 \right) F\_s^{T^\*}(dx)\, ds \right) \\ &\quad \times \exp\left( -\int\_0^{T\_i^\*} \int\_{\mathbb{R}} e^{x\Lambda^{i-1}(s)} (R+\mathrm{i}u) \left( e^{x\lambda(s, T\_i^\*)} - 1 \right) F\_s^{T^\*}(dx)\, ds \right) \\ &\quad \times \exp\left( \int\_0^{T\_i^\*} \frac{c\_s}{2} (R+\mathrm{i}u)(R+\mathrm{i}u-1)\lambda^2(s, T\_i^\*)\, ds \right) \frac{du}{R+\mathrm{i}u-1}. \end{aligned} \tag{79}$$

*Proof* The proof follows along the lines of Sect. 4 in Eberlein et al. (2010) [8].
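Integrals of the type appearing in (79) are one-dimensional and can be approximated by standard quadrature once the integration domain is truncated. A minimal numerical sketch (the integrand `g` and all tuning constants are hypothetical stand-ins, not part of the model):

```python
import numpy as np

def fourier_greek(g, scale, u_max=2000.0, n=20001):
    """Approximate scale/(2*pi) * integral of g over the real line by the
    trapezoidal rule on the truncated domain [-u_max, u_max].  g takes a
    numpy array of real u and may return complex values; the truncation
    point and grid size are toy choices to be tuned to the decay of g.
    """
    u, h = np.linspace(-u_max, u_max, n, retstep=True)
    vals = g(u)
    integral = h * (vals.sum() - 0.5 * (vals[0] + vals[-1]))
    return (scale / (2.0 * np.pi) * integral).real

# sanity check against a known integral: 1/(1+u^2) integrates to pi over R,
# so with scale = 2 the result should be close to 1.
val = fourier_greek(lambda u: 1.0 / (1.0 + u**2), scale=2.0)
```

In practice `g` would be the bracketed factor of (79) and the truncation point chosen from the exponential decay of the integrand in $u$.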

# *4.3 Examples*

#### **4.3.1 Variance Gamma Process (VG)**

We suppose that the jump component of the driving process $L^{T^*}$ (see (8)) is described by a Variance Gamma process with the Lévy density $\nu$ given by

$$\nu(dx) = F_{VG}(x)\, dx \tag{80}$$

such that

$$F_{VG}(x) := \frac{1}{\eta|x|} \exp\left(\frac{\theta}{\sigma^2}x - \frac{1}{\sigma}\sqrt{\frac{2}{\eta} + \frac{\theta^2}{\sigma^2}}\,|x|\right), \tag{81}$$

where (θ , σ, η) are the parameters such that θ ∈ R, σ > 0 and η > 0.

Let us put $B = \frac{\theta}{\sigma^2}$ and $C = \frac{1}{\sigma}\sqrt{\frac{2}{\eta} + \frac{\theta^2}{\sigma^2}}$ and get

$$F_{VG}(x) = \frac{\exp\left(Bx - C|x|\right)}{\eta|x|}. \tag{82}$$
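As a quick numerical sanity check of the parametrization (81)–(82), one can integrate $x^2 F_{VG}(x)$ over the real line: for a Variance Gamma process this second moment of the Lévy measure equals $\sigma^2 + \theta^2\eta$, the variance of the process per unit time. A sketch with toy parameters (all numbers hypothetical):

```python
import numpy as np

def vg_density(x, theta, sigma, eta):
    """Variance Gamma Levy density in the (B, C) form of (82)."""
    b = theta / sigma**2
    c = np.sqrt(2.0 / eta + theta**2 / sigma**2) / sigma
    return np.exp(b * x - c * np.abs(x)) / (eta * np.abs(x))

def trap(y, x):
    # simple trapezoidal rule
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

# toy parameters, chosen only for illustration
theta, sigma, eta = -0.1, 0.2, 0.5
xp = np.linspace(1e-6, 5.0, 400001)    # positive jumps
xn = np.linspace(-5.0, -1e-6, 400001)  # negative jumps
second_moment = (trap(xp**2 * vg_density(xp, theta, sigma, eta), xp)
                 + trap(xn**2 * vg_density(xn, theta, sigma, eta), xn))
```

Note that $x^2 F_{VG}(x) = |x|\, e^{Bx - C|x|}/\eta$ has no singularity at the origin, so the quadrature only needs to avoid the point $x = 0$ itself.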

In this case, the moment-generating function $M_{X_{T_i^*}}$ is given by

$$M_{X_{T_i^*}}(z) = \left(F(0, T_i^*, T_{i-1}^*)\right)^z \exp\left(\int_0^{T_i^*} \left(\frac{c_s z}{2}(z-1)\lambda^2(s, T_i^*) + I^{VG}(s, z)\right) ds\right), \tag{83}$$

where the generalized integral $I^{VG}(s, z)$ is given by

$$\begin{split}
I^{VG}(s, z) &:= \int_{\mathbb{R}} \left( e^{x\left(z\lambda(s, T_i^*) + \Lambda^{i-1}(s)\right)} - e^{x\Lambda^{i-1}(s)} \right) F_{VG}(x)\, dx \\
&\quad - \int_{\mathbb{R}} z \left( e^{x\Lambda^i(s)} - e^{x\Lambda^{i-1}(s)} \right) F_{VG}(x)\, dx.
\end{split} \tag{84}$$

Now, substituting $F_{VG}(x)$ by its explicit expression, we get

$$\begin{split}
I^{VG}(s,z) &= \int_{\mathbb{R}} \left( e^{x\left(z\lambda(s,T_i^*) + \Lambda^{i-1}(s)\right)} - e^{x\Lambda^{i-1}(s)} \right) \exp\left(Bx - C|x|\right) \frac{dx}{\eta|x|} \\
&\quad - \int_{\mathbb{R}} z \left( e^{x\Lambda^{i}(s)} - e^{x\Lambda^{i-1}(s)} \right) \exp\left(Bx - C|x|\right) \frac{dx}{\eta|x|} \\
&= \int_{0}^{+\infty} \left( e^{x\left(z\lambda(s,T_i^*) + \Lambda^{i-1}(s)\right)} - e^{x\Lambda^{i-1}(s)} \right) \exp\left(Bx - Cx\right) \frac{dx}{\eta x} \\
&\quad - \int_{0}^{+\infty} z \left( e^{x\Lambda^{i}(s)} - e^{x\Lambda^{i-1}(s)} \right) \exp\left(Bx - Cx\right) \frac{dx}{\eta x} \\
&\quad - \int_{-\infty}^{0} \left( e^{x\left(z\lambda(s,T_i^*) + \Lambda^{i-1}(s)\right)} - e^{x\Lambda^{i-1}(s)} \right) \exp\left(Bx + Cx\right) \frac{dx}{\eta x} \\
&\quad + \int_{-\infty}^{0} z \left( e^{x\Lambda^{i}(s)} - e^{x\Lambda^{i-1}(s)} \right) \exp\left(Bx + Cx\right) \frac{dx}{\eta x},
\end{split}$$

or

$$\begin{split}
I^{VG}(s,z) &= \int_{0}^{+\infty} \frac{e^{(z\lambda(s,T_i^*) + \Lambda^{i-1}(s) + B - C)x} - e^{(\Lambda^{i-1}(s) + B - C)x}}{\eta x}\, dx \\
&\quad - \int_{0}^{+\infty} z\, \frac{e^{(\Lambda^{i}(s) + B - C)x} - e^{(\Lambda^{i-1}(s) + B - C)x}}{\eta x}\, dx \\
&\quad - \int_{-\infty}^{0} \frac{e^{(z\lambda(s,T_i^*) + \Lambda^{i-1}(s) + B + C)x} - e^{(\Lambda^{i-1}(s) + B + C)x}}{\eta x}\, dx \\
&\quad + \int_{-\infty}^{0} z\, \frac{e^{(\Lambda^{i}(s) + B + C)x} - e^{(\Lambda^{i-1}(s) + B + C)x}}{\eta x}\, dx \\
&= \int_{0}^{+\infty} \frac{e^{(z\lambda(s,T_i^*) + \Lambda^{i-1}(s) + B - C)x} - e^{(\Lambda^{i-1}(s) + B - C)x}}{\eta x}\, dx \\
&\quad - \int_{0}^{+\infty} z\, \frac{e^{(\Lambda^{i}(s) + B - C)x} - e^{(\Lambda^{i-1}(s) + B - C)x}}{\eta x}\, dx \\
&\quad + \int_{0}^{+\infty} \frac{e^{-(z\lambda(s,T_i^*) + \Lambda^{i-1}(s) + B + C)x} - e^{-(\Lambda^{i-1}(s) + B + C)x}}{\eta x}\, dx \\
&\quad - \int_{0}^{+\infty} z\, \frac{e^{-(\Lambda^{i}(s) + B + C)x} - e^{-(\Lambda^{i-1}(s) + B + C)x}}{\eta x}\, dx.
\end{split}$$


Using the notations

$$\alpha_i(s, z) = -\left(z\lambda(s, T_i^*) + \Lambda^{i-1}(s) + B - C\right), \tag{85}$$

$$\beta_i(s) = -\left(\Lambda^{i-1}(s) + B - C\right), \tag{86}$$

$$\gamma_i(s) = -\left(\Lambda^i(s) + B - C\right), \tag{87}$$

we end up with

$$\begin{split}
I^{VG}(s,z) &= \frac{1}{\eta}\int_{0}^{+\infty} \left[ \frac{e^{-\alpha_i(s,z)x} - e^{-\beta_i(s)x}}{x} - z\, \frac{e^{-\gamma_i(s)x} - e^{-\beta_i(s)x}}{x} \right] dx \\
&\quad + \frac{1}{\eta}\int_{0}^{+\infty} \left[ \frac{e^{-\left(2C - \alpha_i(s,z)\right)x} - e^{-\left(2C - \beta_i(s)\right)x}}{x} - z\, \frac{e^{-\left(2C - \gamma_i(s)\right)x} - e^{-\left(2C - \beta_i(s)\right)x}}{x} \right] dx.
\end{split}$$

Using Frullani's integral (see Ostrowski (1949) [16] for details), we can show that, if $\alpha \in \mathbb{C}$ and $\beta \in \mathbb{C}$ are such that $\mathrm{Re}(\alpha) > 0$, $\mathrm{Re}(\beta) > 0$ and $\frac{\beta}{\alpha} \in \mathbb{C} \setminus \mathbb{R}^-$, where $\mathbb{R}^- = (-\infty, 0]$, then

$$I_{(\alpha,\beta)} := \int_0^{+\infty} \frac{e^{-\alpha x} - e^{-\beta x}}{x}\, dx = \operatorname{Log}\left(\frac{\beta}{\alpha}\right), \tag{88}$$

where $\operatorname{Log}$ denotes the principal branch of the logarithm. Consequently,
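The identity (88) is easy to verify numerically for complex parameters satisfying the stated conditions; a rough quadrature sketch (the parameter values and truncation choices are arbitrary toy inputs):

```python
import numpy as np

def frullani_numeric(alpha, beta, x_max=50.0, n=500001):
    """Numerically integrate (e^{-alpha x} - e^{-beta x})/x over (0, infinity)
    with the trapezoidal rule, truncating at x_max and starting just above 0
    (the integrand extends continuously to beta - alpha at x = 0)."""
    x = np.linspace(1e-8, x_max, n)
    y = (np.exp(-alpha * x) - np.exp(-beta * x)) / x
    return np.sum(0.5 * (y[1:] + y[:-1])) * (x[1] - x[0])

# toy complex parameters with Re > 0 and beta/alpha off the negative half-line
alpha, beta = 2.0 + 1.0j, 3.0 - 0.5j
lhs = frullani_numeric(alpha, beta)
rhs = np.log(beta / alpha)  # numpy's complex log is the principal branch Log
```

The agreement of `lhs` and `rhs` illustrates why the positive-real-part conditions on $\alpha_i$, $\beta_i$, $\gamma_i$ matter: they guarantee integrability at infinity.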

$$\begin{split}
I^{VG}(s,z) &= \frac{1}{\eta}\left[\operatorname{Log}\left(\frac{\beta_i(s)}{\alpha_i(s,z)}\right) - z \operatorname{Log}\left(\frac{\beta_i(s)}{\gamma_i(s)}\right) + \operatorname{Log}\left(\frac{2C - \beta_i(s)}{2C - \alpha_i(s,z)}\right) - z \operatorname{Log}\left(\frac{2C - \beta_i(s)}{2C - \gamma_i(s)}\right)\right] \\
&= \frac{1}{\eta}\operatorname{Log}\left(\frac{\beta_i(s)}{\alpha_i(s,z)} \cdot \frac{2C - \beta_i(s)}{2C - \alpha_i(s,z)}\right) - \frac{z}{\eta}\operatorname{Log}\left(\frac{\beta_i(s)}{\gamma_i(s)} \cdot \frac{2C - \beta_i(s)}{2C - \gamma_i(s)}\right).
\end{split}$$

The moment-generating function $M_{X_{T_i^*}}$ becomes

$$\begin{split}
M_{X_{T_i^*}}(z) &= \left( F(0, T_i^*, T_{i-1}^*) \right)^z \exp\left( \int_0^{T_i^*} \frac{c_s z}{2} (z - 1) \lambda^2(s, T_i^*)\, ds \right) \\
&\quad \times \exp\left( \frac{1}{\eta}\int_0^{T_i^*} \operatorname{Log}\left( \frac{\beta_i(s) \left( 2C - \beta_i(s) \right)}{\alpha_i(s, z) \left( 2C - \alpha_i(s, z) \right)} \right) ds \right) \\
&\quad \times \exp\left( - \frac{z}{\eta}\int_0^{T_i^*} \operatorname{Log}\left( \frac{\beta_i(s) \left( 2C - \beta_i(s) \right)}{\gamma_i(s) \left( 2C - \gamma_i(s) \right)} \right) ds \right)
\end{split}$$

or

$$\begin{split}
M_{X_{T_i^*}}(R+\mathrm{i}u) &= \left( F(0, T_i^*, T_{i-1}^*) \right)^{R+\mathrm{i}u} \\
&\quad \times \exp\left( \int_0^{T_i^*} \frac{c_s}{2} (R+\mathrm{i}u)(R+\mathrm{i}u-1)\lambda^2(s, T_i^*)\, ds \right) \\
&\quad \times \exp\left( \frac{1}{\eta}\int_0^{T_i^*} \operatorname{Log}\left(\frac{\beta_i(s)\left(2C-\beta_i(s)\right)}{\alpha_i(s, R+\mathrm{i}u)\left(2C-\alpha_i(s, R+\mathrm{i}u)\right)}\right) ds \right) \\
&\quad \times \exp\left( -\frac{R+\mathrm{i}u}{\eta}\int_0^{T_i^*} \operatorname{Log}\left(\frac{\beta_i(s)\left(2C-\beta_i(s)\right)}{\gamma_i(s)\left(2C-\gamma_i(s)\right)}\right) ds \right).
\end{split}$$

The valuation formula becomes

$$\begin{split}
\mathrm{Cpl}_0(T_i^*, K) &= \frac{B(0, T_{i-1}^*)}{2\pi} \int_{\mathbb{R}} \frac{\widetilde{K}_i^{1-R-\mathrm{i}u} M_{X_{T_i^*}}(R+\mathrm{i}u)}{(R+\mathrm{i}u)(R+\mathrm{i}u-1)}\, du \\
&= \frac{\widetilde{K}_i B(0, T_{i-1}^*)}{2\pi} \int_{\mathbb{R}} \Bigg\{ \left(\frac{F(0, T_i^*, T_{i-1}^*)}{\widetilde{K}_i}\right)^{R+\mathrm{i}u} \\
&\quad \times \exp\left(\int_0^{T_i^*} \frac{c_s}{2} (R+\mathrm{i}u)(R+\mathrm{i}u-1)\lambda^2(s, T_i^*)\, ds\right) \\
&\quad \times \exp\left(\frac{1}{\eta}\int_0^{T_i^*} \operatorname{Log}\left(\frac{\beta_i(s)\left(2C-\beta_i(s)\right)}{\alpha_i(s, R+\mathrm{i}u)\left(2C-\alpha_i(s, R+\mathrm{i}u)\right)}\right) ds\right) \\
&\quad \times \exp\left(-\frac{R+\mathrm{i}u}{\eta}\int_0^{T_i^*} \operatorname{Log}\left(\frac{\beta_i(s)\left(2C-\beta_i(s)\right)}{\gamma_i(s)\left(2C-\gamma_i(s)\right)}\right) ds\right) \Bigg\} \frac{du}{(R+\mathrm{i}u)(R+\mathrm{i}u-1)}.
\end{split} \tag{89}$$

The *Delta* is given by

$$\begin{split}
\Delta(F(0, T_i^*, T_{i-1}^*)) &= \frac{B(0, T_{i-1}^*)}{2\pi} \int_{\mathbb{R}} \Bigg\{ \left( \frac{F(0, T_i^*, T_{i-1}^*)}{\widetilde{K}_i} \right)^{R+\mathrm{i}u-1} \\
&\quad \times \exp\left( \int_0^{T_i^*} \frac{c_s}{2}(R+\mathrm{i}u)(R+\mathrm{i}u-1)\lambda^2(s, T_i^*)\, ds \right) \\
&\quad \times \exp\left( \frac{1}{\eta}\int_0^{T_i^*} \operatorname{Log}\left( \frac{\beta_i(s)\left(2C-\beta_i(s)\right)}{\alpha_i(s, R+\mathrm{i}u)\left(2C-\alpha_i(s, R+\mathrm{i}u)\right)} \right) ds \right) \\
&\quad \times \exp\left( -\frac{R+\mathrm{i}u}{\eta}\int_0^{T_i^*} \operatorname{Log}\left( \frac{\beta_i(s)\left(2C-\beta_i(s)\right)}{\gamma_i(s)\left(2C-\gamma_i(s)\right)} \right) ds \right) \Bigg\} \frac{du}{R+\mathrm{i}u-1}.
\end{split} \tag{90}$$

#### **4.3.2 Inhomogeneous Gamma Process (IGP)**

We suppose that the jump component of the driving process $L^{T^*}$ is described by an inhomogeneous Gamma process (IGP), which has been introduced by Berman (1981) [1] as follows.

**Definition 4.4** Let $A(t)$ be a nondecreasing function from $\mathbb{R}^+ \longrightarrow \mathbb{R}^+$ and $B > 0$. A Gamma process with shape function $A$ and scale parameter $B$ is a stochastic process $(L_t)_{t \geq 0}$ on $\mathbb{R}^+$ such that:

1. $L_0 = 0$;
2. $L$ has independent increments;
3. for all $0 \leq s < t$, the increment $L_t - L_s$ is Gamma distributed with shape $A(t) - A(s)$ and scale parameter $B$.
We suppose that the shape function *A* is differentiable, hence we can write

$$A(t) = A(0) + \int\_0^t \dot{A}(s)ds\tag{91}$$

for all *t* ∈ R<sup>+</sup> where *A*˙ denotes the derivative of *A*. In this case, the Lévy density of the Gamma process *L* is given by

$$F\_s^G(\mathbf{x}) = \dot{A}(\mathbf{s}) \frac{e^{-B\mathbf{x}}}{\mathbf{x}} \mathbf{1}\_{\{\mathbf{x} > 0\}}.\tag{92}$$

The moment-generating function (53) has the form

*MXT* <sup>∗</sup> *i* (*z*) = *F*(0, *T* ∗ *<sup>i</sup>* , *T* <sup>∗</sup> *<sup>i</sup>*−1) *<sup>z</sup>* exp *<sup>T</sup>* <sup>∗</sup> *i* 0 *csz* <sup>2</sup> (*<sup>z</sup>* <sup>−</sup> <sup>1</sup>)λ2(*s*, *<sup>T</sup>* <sup>∗</sup> *<sup>i</sup>* )*ds* <sup>×</sup> exp *<sup>T</sup>* <sup>∗</sup> *i* 0 R *ex*Λ*i*−1(*s*) *ezx*λ(*s*,*<sup>T</sup>* <sup>∗</sup> *<sup>i</sup>* ) <sup>−</sup> <sup>1</sup> − *z* - *ex*λ(*s*,*<sup>T</sup>* <sup>∗</sup> *<sup>i</sup>* ) <sup>−</sup> <sup>1</sup> *F<sup>G</sup> <sup>s</sup>* (*x*)*dxds* = *F*(0, *T* ∗ *<sup>i</sup>* , *T* <sup>∗</sup> *<sup>i</sup>*−1) *<sup>z</sup>* exp *<sup>T</sup>* <sup>∗</sup> *i* 0 *csz* <sup>2</sup> (*<sup>z</sup>* <sup>−</sup> <sup>1</sup>)λ2(*s*, *<sup>T</sup>* <sup>∗</sup> *<sup>i</sup>* )*ds* <sup>×</sup> exp *<sup>T</sup>* <sup>∗</sup> *i* 0 *A*˙(*s*) R *ex*Λ*i*−1(*s*) - *ezx*λ(*s*,*<sup>T</sup>* <sup>∗</sup> *<sup>i</sup>* ) <sup>−</sup> <sup>1</sup> *<sup>e</sup>*−*Bx <sup>x</sup>* **<sup>1</sup>**{*x*>0}*dxds* <sup>×</sup> exp −*z <sup>T</sup>* <sup>∗</sup> *i* 0 *A*˙(*s*) R *ex*Λ*i*−1(*s*) - *ex*λ(*s*,*<sup>T</sup>* <sup>∗</sup> *<sup>i</sup>* ) <sup>−</sup> <sup>1</sup> *<sup>e</sup>*−*Bx <sup>x</sup>* **<sup>1</sup>**{*x*>0}*dxds* = *F*(0, *T* ∗ *<sup>i</sup>* , *T* <sup>∗</sup> *<sup>i</sup>*−1) *<sup>z</sup>* exp *<sup>T</sup>* <sup>∗</sup> *i* 0 *csz* <sup>2</sup> (*<sup>z</sup>* <sup>−</sup> <sup>1</sup>)λ2(*s*, *<sup>T</sup>* <sup>∗</sup> *<sup>i</sup>* ) <sup>+</sup> *<sup>A</sup>*˙(*s*)*<sup>I</sup> <sup>G</sup>*(*s*,*z*) *ds* ,

where

$$\begin{split}
I^G(s, z) &= \int_0^{+\infty} \frac{e^{\left(z\lambda(s, T_i^*) + \Lambda^{i-1}(s) - B\right)x} - e^{\left(\Lambda^{i-1}(s) - B\right)x}}{x}\, dx \\
&\quad - \int_0^{+\infty} z\, \frac{e^{\left(\Lambda^i(s) - B\right)x} - e^{\left(\Lambda^{i-1}(s) - B\right)x}}{x}\, dx.
\end{split}$$

Setting

$$\alpha_i(s, z) = -\left(z\lambda(s, T_i^*) + \Lambda^{i-1}(s) - B\right), \tag{93}$$

$$\beta_i(s) = -\left(\Lambda^{i-1}(s) - B\right), \tag{94}$$

$$\gamma_i(s) = -\left(\Lambda^i(s) - B\right) \tag{95}$$

and using Frullani's integral, we find that

$$\begin{split}
I^G(s,z) &= \int_0^{+\infty} \left[ \frac{e^{-\alpha_i(s,z)x} - e^{-\beta_i(s)x}}{x} - z\, \frac{e^{-\gamma_i(s)x} - e^{-\beta_i(s)x}}{x} \right] dx \\
&= \operatorname{Log}\left(\frac{\beta_i(s)}{\alpha_i(s,z)}\right) - z \operatorname{Log}\left(\frac{\beta_i(s)}{\gamma_i(s)}\right) \\
&= \operatorname{Log}\left(\frac{\Lambda^{i-1}(s) - B}{z\lambda(s, T_i^*) + \Lambda^{i-1}(s) - B}\right) - z \operatorname{Log}\left(\frac{\Lambda^{i-1}(s) - B}{\Lambda^i(s) - B}\right).
\end{split}$$

Therefore, we get the form

$$\begin{split}
M_{X_{T_i^*}}(z) &= \left( F(0, T_i^*, T_{i-1}^*) \right)^z \exp\left( \int_0^{T_i^*} \frac{c_s z}{2}(z-1)\lambda^2(s, T_i^*)\, ds \right) \\
&\quad \times \exp\left( \int_0^{T_i^*} \dot{A}(s) \operatorname{Log}\left( \frac{\Lambda^{i-1}(s) - B}{z\lambda(s, T_i^*) + \Lambda^{i-1}(s) - B} \right) ds \right) \\
&\quad \times \exp\left( -z \int_0^{T_i^*} \dot{A}(s) \operatorname{Log}\left( \frac{\Lambda^{i-1}(s) - B}{\Lambda^i(s) - B} \right) ds \right)
\end{split}$$

or

$$\begin{split}
M_{X_{T_i^*}}(R+\mathrm{i}u) &= \left( F(0, T_i^*, T_{i-1}^*) \right)^{R+\mathrm{i}u} \\
&\quad \times \exp\left( \int_0^{T_i^*} \frac{c_s}{2}(R+\mathrm{i}u)(R+\mathrm{i}u-1)\lambda^2(s, T_i^*)\, ds \right) \\
&\quad \times \exp\left( \int_0^{T_i^*} \dot{A}(s) \operatorname{Log}\left( \frac{\Lambda^{i-1}(s) - B}{(R+\mathrm{i}u)\lambda(s, T_i^*) + \Lambda^{i-1}(s) - B} \right) ds \right) \\
&\quad \times \exp\left( -(R+\mathrm{i}u) \int_0^{T_i^*} \dot{A}(s) \operatorname{Log}\left( \frac{\Lambda^{i-1}(s) - B}{\Lambda^i(s) - B} \right) ds \right).
\end{split}$$

The valuation formula becomes

$$\begin{split}
\mathrm{Cpl}_0(T_i^*, K) &= \frac{B(0, T_{i-1}^*)}{2\pi} \int_{\mathbb{R}} \frac{\widetilde{K}_i^{1-R-\mathrm{i}u} M_{X_{T_i^*}}(R+\mathrm{i}u)}{(R+\mathrm{i}u)(R+\mathrm{i}u-1)}\, du \\
&= \frac{\widetilde{K}_i B(0, T_{i-1}^*)}{2\pi} \int_{\mathbb{R}} \Bigg\{ \left(\frac{F(0, T_i^*, T_{i-1}^*)}{\widetilde{K}_i}\right)^{R+\mathrm{i}u} \\
&\quad \times \exp\left(\int_0^{T_i^*} \frac{c_s}{2}(R+\mathrm{i}u)(R+\mathrm{i}u-1)\lambda^2(s, T_i^*)\, ds\right) \\
&\quad \times \exp\left(\int_0^{T_i^*} \dot{A}(s) \operatorname{Log}\left(\frac{\Lambda^{i-1}(s) - B}{(R+\mathrm{i}u)\lambda(s, T_i^*) + \Lambda^{i-1}(s) - B}\right) ds\right) \\
&\quad \times \exp\left(-\int_0^{T_i^*} (R+\mathrm{i}u)\dot{A}(s) \operatorname{Log}\left(\frac{\Lambda^{i-1}(s) - B}{\Lambda^i(s) - B}\right) ds\right) \Bigg\} \frac{du}{(R+\mathrm{i}u)(R+\mathrm{i}u-1)}.
\end{split} \tag{96}$$

The Greek *Delta* is given by

$$\begin{split}
\Delta(F(0, T_i^*, T_{i-1}^*)) &= \frac{B(0, T_{i-1}^*)}{2\pi} \int_{\mathbb{R}} \Bigg\{ \left( \frac{F(0, T_i^*, T_{i-1}^*)}{\widetilde{K}_i} \right)^{R+\mathrm{i}u-1} \\
&\quad \times \exp\left( \int_0^{T_i^*} \frac{c_s}{2}(R+\mathrm{i}u)(R+\mathrm{i}u-1)\lambda^2(s, T_i^*)\, ds \right) \\
&\quad \times \exp\left( \int_0^{T_i^*} \dot{A}(s) \operatorname{Log}\left( \frac{\Lambda^{i-1}(s) - B}{(R+\mathrm{i}u)\lambda(s, T_i^*) + \Lambda^{i-1}(s) - B} \right) ds \right) \\
&\quad \times \exp\left( -\int_0^{T_i^*} (R+\mathrm{i}u)\dot{A}(s) \operatorname{Log}\left( \frac{\Lambda^{i-1}(s) - B}{\Lambda^i(s) - B} \right) ds \right) \Bigg\} \frac{du}{R+\mathrm{i}u-1}.
\end{split} \tag{97}$$

**Acknowledgments** The KPMG Center of Excellence in Risk Management is acknowledged for organizing the conference "Challenges in Derivatives Markets - Fixed Income Modeling, Valuation Adjustments, Risk Management, and Regulation".

# **A Appendix**

# *A.1 Isonormal Lévy Process (ILP)*

Let $\mu$ and $\nu$ be $\sigma$-finite measures without atoms on the measurable spaces $(\mathbb{T}, \mathscr{A})$ and $(\mathbb{T} \times X_0, \mathscr{B})$, respectively. Define a new measure

$$
\pi(dt, dz) := \mu(dt)\delta\_{\Theta}(dz) + \nu(dt, dz) \tag{98}
$$

on a measurable space $(\mathbb{T} \times X, \mathscr{G})$, where $X = X_0 \cup \{\Theta\}$, $\mathscr{G} = \sigma(\mathscr{A} \times \{\Theta\}, \mathscr{B})$ and $\delta_\Theta(dz)$ is the measure which gives mass one to the point $\Theta$. We assume that the Hilbert space $H := L^2(\mathbb{T} \times X, \mathscr{G}, \pi)$ is separable.

**Definition A.1** We say that a stochastic process $L = \{L(h),\, h \in H\}$ defined on a complete probability space $(\Omega, \mathscr{F}, P)$ is an isonormal Lévy process (or a Lévy process on $H$) if the following conditions are satisfied:

1. The mapping *h* −→ *L*(*h*) is linear;

2. $\mathbb{E}[e^{\mathrm{i}xL(h)}] = \exp(\Psi(x, h))$, where $\Psi(x, h)$ is equal to

$$\int_{\mathbb{T}\times X} \left( (e^{\mathrm{i}x h(t,z)} - 1 - \mathrm{i}x h(t,z))\, \mathbf{1}_{X_0}(z) - \frac{1}{2} x^2 h^2(t,z)\, \mathbf{1}_{\Theta}(z) \right) \pi(dt, dz). \tag{99}$$

# *A.2 The Derivative Operator*

Let *S* denote the class of smooth random variables, that is the class of random variables ξ of the form

$$\xi = f(L(h\_1), \dots, L(h\_n)),\tag{100}$$

where $f$ belongs to $C_b^\infty(\mathbb{R}^n)$, $h_1, \dots, h_n$ are in $H$, and $n \geq 1$. The set $\mathcal{S}$ is dense in $L^p(\Omega)$ for any $p \geq 1$.

**Definition A.2** The stochastic derivative of a smooth random variable of the form (100) is the $H$-valued random variable $D\xi = \{D_{t,x}\xi,\ (t, x) \in \mathbb{T} \times X\}$ given by

$$\begin{split}
D_{t,x} \xi &= \sum_{k=1}^{n} \frac{\partial f}{\partial y_k}(L(h_1), \dots, L(h_n))\, h_k(t, x)\, \mathbf{1}_{\Theta}(x) \\
&\quad + \Big( f\left( L(h_1) + h_1(t, x), \dots, L(h_n) + h_n(t, x) \right) - f\left( L(h_1), \dots, L(h_n) \right) \Big)\, \mathbf{1}_{X_0}(x).
\end{split} \tag{101}$$

We will consider $D\xi$ as an element of $L^2(\mathbb{T} \times X \times \Omega) \cong L^2(\Omega; H)$; namely, $D\xi$ is a random process indexed by the parameter space $\mathbb{T} \times X$.


# *A.3 Integration by Parts Formula*

**Theorem A.3** *Suppose that* ξ *and* η *are smooth random variables and h* ∈ *H. Then 1.*

$$\mathbb{E}[\xi L(h)] = \mathbb{E}[\langle D\xi; h\rangle\_H];\tag{102}$$

*2.*

$$\mathbb{E}[\xi \eta L(h)] = \mathbb{E}[\eta \langle D\xi; h \rangle\_H] + \mathbb{E}[\xi \langle D\eta; h \rangle\_H] + \mathbb{E}[\langle D\eta; h \mathbf{1}\_{X\_0} D\xi \rangle\_H]. \tag{103}$$

As a consequence of the above theorem we obtain the following result:

1. The expression of the derivative $D\xi$ given in (101) does not depend on the particular representation of $\xi$ in (100).
2. The operator $D$ is closable as an operator from $L^2(\Omega)$ to $L^2(\Omega; H)$.

We will denote the closure of $D$ again by $D$ and its domain in $L^2(\Omega)$ by $\mathbb{D}^{1,2}$.

# *A.4 The Chain Rule*

**Proposition A.4** (see Yablonski (2008), Proposition 4.8) *Suppose $F = (F_1, F_2, \dots, F_n)$ is a random vector whose components belong to the space $\mathbb{D}^{1,2}$. Let $\phi \in C^1(\mathbb{R}^n)$ be a function with bounded partial derivatives such that $\phi(F) \in L^2(\Omega)$. Then $\phi(F) \in \mathbb{D}^{1,2}$ and*

$$D_{t,x} \phi(F) = \begin{cases} \sum_{l=1}^{n} \frac{\partial \phi}{\partial x_l}(F)\, D_{t,\Theta} F_l, & x = \Theta, \\ \phi(F_1 + D_{t,x} F_1, \dots, F_n + D_{t,x} F_n) - \phi(F_1, \dots, F_n), & x \neq \Theta. \end{cases} \tag{104}$$

# *A.5 Regularity of Solutions of SDEs Driven by Time-Inhomogeneous Lévy Processes*

We focus on a class of models in which the price of the underlying asset is given by the following stochastic differential equation (see Di Nunno et al. [2] and Petrou [17] for details)

$$\begin{split}
dS_t &= b(t, S_{t-})\, dt + \sigma(t, S_{t-})\, dW_t + \int_{\mathbb{R}_0} \varphi(t, S_{t-}, z)\, \widetilde{N}(dt, dz), \\
S_0 &= x,
\end{split} \tag{105}$$

where $\mathbb{R}_0 := \mathbb{R}^d \setminus \{0_{\mathbb{R}^d}\}$, $x \in \mathbb{R}^d$, $\{W_t, 0 \leq t \leq T\}$ is an $m$-dimensional standard Brownian motion and $\widetilde{N}$ is a compensated Poisson random measure on $[0, T] \times \mathbb{R}_0$ with compensator $\nu_t(dz)dt$. The coefficients $b : \mathbb{R}^+ \times \mathbb{R}^d \longrightarrow \mathbb{R}^d$, $\sigma : \mathbb{R}^+ \times \mathbb{R}^d \longrightarrow \mathbb{R}^d \times \mathbb{R}^m$ and $\varphi : \mathbb{R}^+ \times \mathbb{R}^d \times \mathbb{R} \longrightarrow \mathbb{R}^d \times \mathbb{R}$ are continuously differentiable with bounded derivatives, and the family of positive measures $(\nu_t)_{t \in [0,T]}$ satisfies $\int_0^T \int_{\mathbb{R}_0} (z^2 \wedge 1)\, \nu_t(dz)\, dt < \infty$ and $\nu_t(\{0\}) = 0$. The coefficients are assumed to satisfy the following linear growth condition

312 E. Eberlein et al.

$$\|b(t,x)\|^2 + \|\sigma(t,x)\|^2 + \int_{\mathbb{R}_0} \|\varphi(t,x,z)\|^2\, \nu_t(dz) \leq C(1 + \|x\|^2), \tag{106}$$

for all *t* ∈ [0, *T* ], *x* ∈ R*<sup>d</sup>* , where *C* is a positive constant. Furthermore, we suppose that there exists a function ρ : R −→ R with

$$\sup\_{0 \le t \le T} \int\_{\mathbb{R}\_0} |\rho(z)|^2 \nu\_t(dz) < \infty,\tag{107}$$

and a positive constant *K* such that

$$\|\varphi(t,x,z) - \varphi(t,y,z)\| \leq K|\rho(z)|\, \|x - y\|, \tag{108}$$

for all *t* ∈ [0, *T* ], *x*, *y* ∈ R*<sup>d</sup>* and *z* ∈ R0.
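Under such conditions, solutions of (105) are commonly approximated with an Euler scheme. A one-dimensional sketch with toy coefficients (the 5% drift, 20% diffusion, jump intensity and mark size below are arbitrary illustration choices) and a compound Poisson jump part whose centered marks make the compensator term vanish:

```python
import numpy as np

def euler_jump_diffusion(x0, b, sigma, lam, mark_std, T, n_steps, rng):
    """One Euler path for a one-dimensional special case of (105):
    dS = b(t,S)dt + sigma(t,S)dW + S dJ, where J is compound Poisson with
    intensity lam and centered normal marks, so the compensator of the
    jump part vanishes.  All coefficients are toy choices for illustration.
    """
    dt = T / n_steps
    s = x0
    for k in range(n_steps):
        t = k * dt
        dw = rng.normal(0.0, np.sqrt(dt))
        n_jumps = rng.poisson(lam * dt)
        jump = s * rng.normal(0.0, mark_std, size=n_jumps).sum()
        s = s + b(t, s) * dt + sigma(t, s) * dw + jump
    return s

rng = np.random.default_rng(42)
# toy coefficients: 5% drift and 20% diffusion, both proportional to S
s_T = euler_jump_diffusion(1.0, lambda t, s: 0.05 * s, lambda t, s: 0.2 * s,
                           lam=0.5, mark_std=0.1, T=1.0, n_steps=200, rng=rng)
```

Because the diffusion and jump parts are mean-zero here, $\mathbb{E}[S_T] \approx S_0 e^{0.05 T}$, which gives a simple Monte Carlo check of the scheme.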

In the sequel we provide a theorem which shows that, under specific conditions, the solution of a stochastic differential equation belongs to the domain $\mathbb{D}^{1,2}$.

**Theorem A.5** *Let $(S_t)_{t \in [0,T]}$ be the solution of (105). Then $S_t \in \mathbb{D}^{1,2}$ for all $t \in [0, T]$ and the derivative $D_{r,0} S_t$ satisfies the following linear equation*

$$\begin{split}
D_{r,0} S_t &= \int_r^t \frac{\partial b}{\partial x}(u, S_{u-})\, D_{r,0} S_{u-}\, du + \sigma(r, S_{r-}) \\
&\quad + \int_r^t \frac{\partial \sigma}{\partial x}(u, S_{u-})\, D_{r,0} S_{u-}\, dW_u \\
&\quad + \int_r^t \int_{\mathbb{R}_0} \frac{\partial \varphi}{\partial x}(u, S_{u-}, y)\, D_{r,0} S_{u-}\, \widetilde{N}(du, dy)
\end{split} \tag{109}$$

*for $0 \leq r \leq t$ a.e., and $D_{r,0} S_t = 0$ a.e. otherwise.*

As in the classical Malliavin calculus, we are able to associate the solution of (105) with the first variation process $Y_t := \nabla_x S_t$. We will also provide a specific expression for $D_{r,0} S_t$, the Wiener directional derivative of $S_t$.

**Proposition A.6** *Let* (*St*)*<sup>t</sup>*∈[0,*<sup>T</sup>* ] *be the solution of (105). Then the derivative satisfies the following equation*

$$D_{r,0} S_t = Y_t Y_{r-}^{-1} \sigma(r, S_{r-})\, \mathbf{1}_{\{r \leq t\}} \quad a.e., \tag{110}$$

*where* (*Yt*)*<sup>t</sup>*∈[0,*<sup>T</sup>* ] *is the first variation process of* (*St*)*<sup>t</sup>*∈[0,*<sup>T</sup>* ]*.*
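Representations like (110) are what make Malliavin-weighted Greeks computable in practice. In the Black–Scholes special case $dS = \sigma S\, dW$ (zero rates), the first variation process is $Y_t = S_t/S_0$ and the well-known Delta weight reduces to $W_T/(S_0 \sigma T)$; a Monte Carlo sketch comparing this with the closed-form delta (toy parameters throughout):

```python
import numpy as np
from math import erf, log, sqrt

def bs_call_delta(s0, k, sigma, t):
    """Closed-form Black-Scholes call delta (zero interest rate)."""
    d1 = (log(s0 / k) + 0.5 * sigma**2 * t) / (sigma * sqrt(t))
    return 0.5 * (1.0 + erf(d1 / sqrt(2.0)))

def malliavin_delta(s0, k, sigma, t, n_paths, rng):
    """Monte Carlo Delta using the Malliavin weight W_T/(S_0*sigma*T),
    the Black-Scholes specialization of the weighted representation
    (toy sketch, zero interest rate)."""
    w_t = rng.normal(0.0, sqrt(t), size=n_paths)
    s_t = s0 * np.exp(sigma * w_t - 0.5 * sigma**2 * t)
    payoff = np.maximum(s_t - k, 0.0)
    return float(np.mean(payoff * w_t / (s0 * sigma * t)))

rng = np.random.default_rng(7)
mc = malliavin_delta(100.0, 100.0, 0.2, 1.0, 400000, rng)
exact = bs_call_delta(100.0, 100.0, 0.2, 1.0)
```

The weighted estimator needs no differentiation of the (non-smooth) payoff, which is precisely the advantage exploited in Sect. 4 for caplet Deltas.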

**Open Access** This chapter is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, a link is provided to the Creative Commons license and any changes made are indicated.

The images or other third party material in this chapter are included in the work's Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work's Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.

# **References**


# **Inside the EMs Risky Spreads and CDS-Sovereign Bonds Basis**

**Vilimir Yordanov**

**Abstract** The paper considers a no-arbitrage setting for pricing and relative value analysis of risky sovereign bonds. We inspect the typical case of an emerging market (EM) country that has bonds outstanding both in foreign hard currency (Eurobonds) and in local soft currency (treasuries). The resulting two yield curves give rise to a credit spread and a currency spread that need further elaboration. We discuss their proper measurement and also derive and analyze the necessary no-arbitrage conditions that must hold. We then turn attention to the CDS-Bond basis in this multi-curve environment. For EM countries the concept shows certain specifics, both in its theoretical background and in its empirical performance. The paper focuses on analyzing these peculiarities. While the proper measurement of the basis is still problematic even in the standard case where only hard currency debt is issued, the situation is much more complicated in a multi-curve setting, where a further contingent claim on the sovereign risk appears in the form of the local currency debt curve. We investigate the issue and provide relevant theoretical and empirical input.

**Keywords** HJM · Foreign debt · Domestic debt · Z-Spread · CDS-Bond basis

# **1 Introduction**

Local currency debt of EM sovereigns has become a hot topic for both practitioners and academics in recent years. Major investment banks and asset managers consider it a separate asset class and regularly publish dedicated local currency investment reports. A joint working group of the IMF, WB, EBRD, and OECD recently demonstrated an official interest in a thorough investigation of this market segment and support for its development, thus forming a strict policy agenda. It was recognized that not only do the local bonds complete the market and thus bring market efficiency, but also

V. Yordanov (B)

Vienna Graduate School of Finance, Vienna, Austria e-mail: research@vjord.com

V. Yordanov Karlsruhe Institute of Technology, Karlsruhe, Germany

they could act as a shock absorber for volatile capital inflows. Furthermore, they provide flexibility to governments in financing their budget deficits. However, these instruments are not well understood from a no-arbitrage point of view and a formal setting is lacking. Such a setting would provide not only a better picture of their inherent risk-return characteristics, but would also be an indispensable tool for market research and strategy. The aim of this paper is precisely to focus attention on the large set of open questions that local currency debt gives rise to and to lay the ground for a formal relative value analysis, with a special emphasis on the CDS-Bond basis.

The paper begins with our general no-arbitrage modeling approach under an HJM reduced-form credit risk setting. It serves as a basis and gives a financial engineering intuition about the nature of the problem. The default of the sovereign is represented as the first jump of a counting process. For the dynamics of the interest rates and the exchange rate we use jump diffusions, controlling the jumps and correlations in a suitable way so as to capture the structural macrofinancial effects with high precision. We derive the no-arbitrage conditions that must hold in this multi-curve environment and then analyze their informational content. We then turn to an application related to correctly extracting the credit and currency spreads and measuring the CDS-Bond basis on a broader scope. This provides basic building blocks for relative value trades in the presence of the local currency yield curve, which could serve as an additional pillar.

The literature on integrating the foreign and domestic debt of a risky sovereign in a consistent way is at a nascent stage, both from an academic and from a practitioners' point of view. Related technically, but different in essence, is the work of Ehlers and Schönbucher [9], who give a reduced-form model for CDS of an obligor denominated in different currencies which accounts for dependence between the exchange rate and the credit spread. Eberlein and Koval [8] give a far-reaching generalization of cross-currency term structure models, but similarly deal only with hard currencies. Regarding the CDS-Bond basis, Berd et al. [2] provide a thorough analysis of the shortcomings of the Z-spread as a risky spread metric.<sup>1</sup> Elizalde et al. [10] further discuss the issue and provide extensive simulations. Interesting new measures for the basis are given in Bernhard and Mai [3], which need further elaboration and development. However, all these references deal with the single-curve case, with an extension to the multi-curve case pending.

# **2 Local Currency Bonds No-Arbitrage HJM Setting**

In this section we first lay the foundations, in brief, for pricing of risky debt in a general reduced-form setting. Then we add the local currency debt into the picture and discuss the risky spreads. We conclude by deriving and analyzing the no-arbitrage conditions.

<sup>1</sup>The Z-spread represents a simple shift of the discounting risk-free curve so that the price of the risky bond is attained.
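The footnote's definition translates directly into a root-finding problem: find the constant shift $z$ that reprices the bond off the risk-free curve. A sketch with toy cashflows and conventions (continuous compounding, illustrative numbers):

```python
from math import exp

def z_spread(price, cashflows, risk_free, lo=-0.05, hi=5.0):
    """Solve for the constant shift z such that discounting the bond's
    cashflows at risk_free(t) + z reproduces the observed price, as the
    footnote describes.  cashflows is a list of (t, amount); risk_free is
    a zero-rate curve t -> y(t) (continuous compounding).  Bisection on
    [lo, hi]; all conventions here are illustrative.
    """
    def pv(z):
        return sum(c * exp(-(risk_free(t) + z) * t) for t, c in cashflows)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if pv(mid) > price:   # pv is decreasing in z
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# toy 2-year 5% annual bond trading at 96 over a flat 2% risk-free curve
cfs = [(1.0, 5.0), (2.0, 105.0)]
z = z_spread(96.0, cfs, lambda t: 0.02)
```

The shortcomings of this metric discussed in [2] stem exactly from the fact that the single shift $z$ mixes default intensity, recovery and liquidity effects.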

# *2.1 Risky Bonds Under Marked Point Process*

The first task is to model default in a suitable way. We start from the most general formulation and then modify it appropriately. We consider a filtered probability space $(\Omega, (G_t)_{t \geq 0}, P)$ which supports an $n$-dimensional Brownian motion $W^P = (W_1, W_2, \dots, W_n)$ under the objective probability measure $P$ and a marked point process $\mu : \left(\Omega, B(R^+), \varepsilon\right) \to R^+$ with markers $(\tau_i, X_i)$ representing the jump times and their sizes in a measurable space $(E, \varepsilon)$, where $E = [0, 1]$ and $\varepsilon$ denotes the Borel subsets of $E$. We assume that $\mu(\omega; dt, dx)$ has a separable compensator of the form:

$$\upsilon : \left(\Omega, B(R^+), \varepsilon\right) \to R^+, \qquad \upsilon\left(\omega; dt, dx\right) = h(\omega; t) F_t(\omega; dx)\, dt,$$

where $h(\omega; t) = \int_E \upsilon(\omega; t, dx)$ is a $G_t$-measurable intensity and the marks have a conditional jump distribution $F_t(\omega; dx)$. Thus, we have the identity $\int_E F_t(\omega; dx) = 1$. Furthermore, we can define the total loss function $L(\omega; t) = \int_0^t \int_E l(\omega; s, x)\, \mu(\omega; ds, dx)$ and the recovery $R(\omega; t) = 1 - \int_0^t \int_E l(\omega; s, x)\, \mu(\omega; ds, dx)$. The function $l(\omega; t, x)$ scales the marks in a suitable way, and having control over it, we can define it such that our model is tractable enough. We also define the sum of the jumps by $S(\omega; t) = \int_0^t \int_E x\, \mu(\omega; ds, dx)$ and their number by $N(\omega; t) = \int_0^t \int_E \mu(\omega; ds, dx)$.

Effectively, the marked point process, as a sequence of random jumps $(\tau_i, X_i)$, is characterized by the random measure $\mu(\omega; dt, dx)$, which counts the jumps with size in $dx$ over a small time interval $dt$. The compensator $\nu(\omega; t, dx)$ provides a full probabilistic characterization of the process. It incorporates two effects. On the one hand, we have the intensity $h(\omega; t)\,dt$, which gives the conditional probability of a jump of the process in a small time interval $dt$, incorporating the whole market information up to $t$. On the other hand, we have the conditional distribution $F_t(\omega; dx)$ of the markers $X$ in case a jump is realized.
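
For intuition, the objects above can be simulated directly. The sketch below is a minimal illustration under assumptions of our own choosing, not part of the model specification: a constant intensity $h$, i.i.d. uniform marks on $E = [0, 1]$, and the simple scaling choice $l(s, x) = x$, so that the total loss $L(t)$ coincides with the sum of the jumps $S(t)$:

```python
import random

def simulate_mpp(h, T, seed=0):
    """Simulate a marked point process with constant intensity h on [0, T].

    Jump times arrive as a Poisson process with rate h; marks X_i are
    i.i.d. uniform on E = [0, 1]. With the scaling choice l(s, x) = x,
    the total loss L(T) equals the sum of the jumps S(T).
    """
    rng = random.Random(seed)
    t, marks = 0.0, []
    while True:
        t += rng.expovariate(h)        # exponential inter-arrival times
        if t > T:
            break
        marks.append(rng.random())     # mark X_i drawn from F = U[0, 1]
    N = len(marks)                      # number of jumps N(T)
    S = sum(marks)                      # sum of jump sizes S(T)
    L = S                               # total loss under l(s, x) = x
    return N, S, L

N, S, L = simulate_mpp(h=0.5, T=10.0)
# for these parameters E[N(T)] = h*T = 5 and E[S(T)] = h*T*E[X] = 2.5
```

Averaging many such paths recovers the compensator-implied expectations, which is exactly the sense in which $\nu$ fully characterizes the process.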

We can look at the jumps of the marked point process as sequential defaults of an obligor at random times $\tau_i$, each leading to a loss $X_i$. They can also be considered as a set of restructuring events leading to losses for the creditors. Under this general setting, the prices of the riskless and risky bonds are given by:

# • **Riskless bond**:

$$P(t,T) = E^{\mathcal{Q}} \left( \exp \left( -\int\_t^T r(s)ds \right) | G\_t \right) = \exp \left( -\int\_t^T f(t,s)ds \right) \tag{1}$$

# • **Risky bond**:

$$\begin{split}P^\*(t,T) &= E^{\mathcal{Q}}\left(\exp\left(-\int\_{t}^{T} r(s)ds\right)R(\omega;T)|G\_t\right) \\ &= R(t)\exp\left(-\int\_{t}^{T} f^\*(t,s)ds\right),\end{split} \tag{2}$$

where *r*(*t*), *f* (*t*, *T* ), and *f* <sup>∗</sup>(*t*, *T* ) are the riskless spot, riskless forward, and risky forward rates respectively.

Depending on how we specify the recovery convention, we can simplify the formulas further. However, any such convention should be well motivated and come either from the legal definitions of the debt contracts or from their economic grounding.

Under a recovery of market value (RMV) setting, default is a percentage markdown, $q$, from the previous recovery value. So we have $R(\omega; \tau_i) = (1 - q(\omega; \tau_i, X_i))\, R(\omega; \tau_i-)$ and $l(\omega; \tau_i)$ has the form $l(\omega; \tau_i) = q(\omega; \tau_i, X_i)\, R(\omega; \tau_i-)$. This definition allows us to write:

$$\mu(\omega; dt, dx) = \sum_{s > 0} 1_{\{\Delta N(\omega; s) \neq 0\}}\, \delta_{(s,\, \Delta S(\omega; s))}(dt, dx)$$

$$dR(\omega; t) = -R(\omega; t-) \int_E q(\omega; t, x)\, \mu(\omega; dt, dx), \qquad R(\omega; 0) = 1,$$

and if we assume no jumps of the intensity and the risk-free rate at default times (no contagion effects), the risk-free bond pricing formula is unchanged, while for the risky one, as in [13], we get:

$$\begin{aligned} P^{*RMV}(t,T) &= E^{Q}\left(\exp\left(-\int_t^T r(s)\,ds\right) R(\omega;T) \,\Big|\, G_t\right) \\ &= R(t)\, E^{Q}\left(\exp\left(-\int_t^T \Big(r(s) + h(s)\int_E q(\omega;s,x)\,F_s(dx)\Big)\,ds\right) \Big|\, G_t\right) \\ &= R(t) \exp\left(-\int_t^T f^{*RMV}(t,s)\,ds\right) \end{aligned} \tag{3}$$

Note that within this setting there is no "last default". The intensity is defined for the whole marked point process and not just for a single concrete default time, so it does not drop to zero after default realizations. This, combined with the fact that the intensity is continuous, makes the market filtration $G_t$ behave like a background filtration in the pricing formulas, so we can avoid using the generalized formula of Duffie, Schroder, and Skiadas [7]. Furthermore, we can denote by $q_e(t) = \int_E q(\omega; t, x)\, F_t(dx)$ the expected loss, so that the pricing formula depends on the generalized intensity $h(t)\,q_e(t)$. Due to the multiplicative nature of the last expression, from market information alone, as discussed in Schönbucher (2003), we cannot distinguish between the pure intensity effect $h(t)$ and the recovery-induced one $q_e(t)$.
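
Formula (3) can be checked numerically in the simplest homogeneous case. The sketch below uses hypothetical constant inputs of our own choosing (constant $r$, $h$, and a deterministic markdown $q$, so that $q_e = q$) and compares the closed-form price $R(t)\exp(-(r + h q_e)(T-t))$ with a Monte Carlo average over Poisson default counts, where each default marks the recovery down by the factor $(1-q)$:

```python
import math, random

def rmv_price_closed_form(R0, r, h, q, T):
    # Eq. (3) with constant inputs: discount at r plus generalized intensity h*q
    return R0 * math.exp(-(r + h * q) * T)

def rmv_price_mc(R0, r, h, q, T, n_paths=200_000, seed=1):
    # each path: Poisson(h*T) defaults, each multiplying the recovery by (1 - q)
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        t, n = 0.0, 0
        while True:                      # count exponential inter-arrivals up to T
            t += rng.expovariate(h)
            if t > T:
                break
            n += 1
        total += math.exp(-r * T) * R0 * (1.0 - q) ** n
    return total / n_paths

cf = rmv_price_closed_form(R0=1.0, r=0.03, h=0.20, q=0.40, T=5.0)
mc = rmv_price_mc(R0=1.0, r=0.03, h=0.20, q=0.40, T=5.0)
# the two agree up to Monte Carlo error, since E[(1-q)^N] = exp(-h*T*q)
```

The agreement rests on the Poisson generating function $E[(1-q)^N] = e^{-hTq}$, which is precisely the exponential of the generalized intensity appearing in (3).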

Under a recovery of par (RP) setting, in case of default the recovery is a separate fixed or random quantity, independent of the default indicator and the risk-free rate. So we have $E = \{0, 1 - R(\omega)\}$ and $\nu(\omega; dt, dx) = h(\omega; t)(1 - R_e)\,dt$ with $R_e = E^Q(R(\omega) \mid G_t)$. Since there is just one jump, we can write:

$$\mu(\omega; dt, dx) = 1_{\{\Delta N(\omega; t) \neq 0\}}\, \delta_{(t,\, \Delta N(\omega; t))}(dt, dx)$$

The bond price is:

$$\begin{split} P^{\*RP}(t,T) &= E^{\mathcal{Q}} \left( \exp \left( -\int\_{t}^{T} r(s)ds \right) \left( R \left( \omega \right) \mathbf{1}\_{\{\tau \le T\}} + \mathbf{1}\_{\{\tau > T\}} \right) \left| G\_{t} \right) \right. \\ &= \mathbf{1}\_{\{\tau > t\}} \exp \left( -\int\_{t}^{T} f^{\*RP}(t,s)ds \right) \end{split} \tag{4}$$

In contrast to RMV, here, as discussed in Schönbucher (2003), we can distinguish between the pure intensity and the recovery-induced effects.

# *2.2 Model Formulation*

In this section we develop our HJM model for pricing the local and foreign currency bonds of a risky country. Before doing this formally, however, it is essential to elaborate on the nature of the problem. Although we do not impose an explicit macro-financial structure, but merely proxy it by jumps and correlations, that structure by all means stays in the background and must be kept in mind conceptually.

#### **2.2.1 General Notes**

A risky emerging market country can have bonds denominated both in local and in foreign currency, giving rise to two risky yield curves and two risky spreads: credit and currency. Generally, these spreads arise from the possibility that the respective credit events occur and from their severity. To investigate them, formal assumptions are needed both on their characteristics and on their interdependence.

We consider the two types of debt to have different priorities. The country is first committed to meeting its foreign debt obligations from its limited international reserves. If it cannot, a default or restructuring follows; in both cases we have a credit event according to the ISDA classification. The foreign debt thus has senior status. The spread that arises reflects the credit risk of the country and is a function of: (1) the probability that the credit event occurs; (2) the expected loss given default; (3) the risk aversion of the market participants to the credit event.

The domestic debt stands differently in economic terms. It reflects the priority of the payments in hard currency and instantly incurs losses in case of a default of the country. This debt is thus the first to be affected by a default and is subordinated. Technically, the credit event can be avoided under a flexible exchange rate regime, because the country can always monetize the debt and pay the amounts due in local currency, taking advantage of the fact that it faces no resource constraint there. However, the price for this is an inflation pick-up and exchange rate depreciation, which lead to a real devaluation of the domestic debt. It is exactly the seigniorage and the dilution effect that cause the loss in value.<sup>2</sup> This resembles the case of a firm issuing more equity to avoid default. The spread of the domestic debt over the foreign one forms the currency spread. Its nature is very broad and is not only due to the currency mismatch. Namely, it is a function of: (1) the probability that the credit event occurs and monetization is needed; (2) the negative side effect of the credit event on the exchange rate via a sudden depreciation; (3) the volatility of the exchange rate; (4) the expected depreciation of the exchange rate without taking the monetization into consideration; (5) the risk aversion of market participants to the credit event and the need for monetization, to the sudden exchange rate depreciation, and to its size; (6) the risk aversion of the market participants to the volatility of the exchange rate. All these effects are captured by our model.

#### **2.2.2 Multi-currency Risky Bonds Model**

We use the setting of Sect. 2.1 modified to multi-currency debt. Firstly, we consider the case without monetization and then analyze the case with monetization. Secondly, to avoid using an additional marked point process, and thus a second intensity, the default on the foreign debt is modeled indirectly. Namely, we assume that a default on the domestic debt leads to a default on the foreign debt, but due to the different priority of the two, different losses, and hence recoveries, are incurred. This means that by controlling the recoveries we control default and the inherent subordination without imposing too much structure. If the default on the domestic debt is so severe that it leads to a default on the foreign debt as well, we incur zero recovery on the domestic debt and some positive recovery on the foreign debt. If the insolvency is mild, we have a loss only on the domestic debt, so we incur some positive recovery on the domestic debt and full recovery on the foreign debt. Thirdly, for notational purposes, we take Germany as the benchmark and EUR as the base hard currency. Lastly, we employ the recovery of market value assumption. The reason for this is twofold. On one hand, in that way we are consistent with the HJM methodology of Schönbucher [12] for a single risky curve under RMV and produce parsimonious no-arbitrage conditions for the extension to a multi-curve environment. On the other

<sup>2</sup>This pattern can be observed historically for almost all EM countries resorting to galloping inflation to avoid a nominal domestic debt default. The Russian default of 1998 is a notable partial exception: along with the inflation surge, there was an actual default on certain ruble (*RUR*) bonds, the GKOs and OFZs.

**Fig. 1** Risky spreads

hand, as pointed out in Bonnaud et al. [5], for bonds denominated in a currency different from the numeraire employed in discounting, the RMV assumption should be the working engine. Their argument is exactly as ours above: in case of default, the sovereign would rather dilute by depreciating the exchange rate, and thus the remaining cash flows of the bond produce in essence the RMV structure. Moreover, rather than using EUR-denominated bonds, we could take advantage of the CDS quotes and produce synthetic bonds having an RMV recovery structure. Using them is actually preferable for empirical work, since major academic studies argue that it is the CDS market that first captures the market information about the credit risk stance of the risky sovereign. Furthermore, with a few exceptions, EM sovereigns in most cases have both well-developed local currency treasury markets and are subject to CDS quotation, while having only a few Eurobonds outstanding. Figure 1 represents the typical situation the risky sovereign faces.

*Mathematical formulation* We continue with the model setup. Firstly, we give the suitable notation and assumptions. Then we move to the derivation of the no-arbitrage conditions and the pricing.

# • **Notation**

- $f_{EUR}(t,T)$—nominal forward rate, EUR, Ger.
- $f^*_{EUR}(t,T)$—nominal forward rate, EUR, EM
- $f^*_{LC}(t,T)$—nominal forward rate in LC, EM
- $r_{EUR}(t)$—nominal short rate, EUR, Ger.
- $r^*_{EUR}(t)$—nominal short rate, EUR, EM
- $r^*_{LC}(t)$—nominal short rate in LC, EM
- $h^*_{EUR}(t,T) = f^*_{EUR}(t,T) - f_{EUR}(t,T)$—credit spread, EM
- $h^*_{LC,EUR}(t,T) = f^*_{LC}(t,T) - f^*_{EUR}(t,T)$—currency spread, EM
- $h^*_{LC}(t,T) = f^*_{LC}(t,T) - f_{EUR}(t,T)$—general currency spread, EM
- $P_{EUR}(t,T) = \exp(-\int_t^T f_{EUR}(t,s)\,ds)$—bond, EUR, Ger.
- $P^*_{f,EUR}(t,T) = R_{f,EUR}(t)\exp(-\int_t^T f^*_{EUR}(t,s)\,ds)$—foreign bond price, EUR, EM
- $P^*_{d,LC}(t,T) = R_{d,LC}(t)\exp(-\int_t^T f^*_{LC}(t,s)\,ds)$—domestic bond price, LC, EM
- $B_{EUR}(t) = \exp(\int_0^t r_{EUR}(s)\,ds)$—bank account, EUR, Ger.
- $B^*_{f,EUR}(t) = R_{f,EUR}(t)\exp(\int_0^t r^*_{EUR}(s)\,ds)$—foreign bank account, EUR, EM
- $B^*_{d,LC}(t) = R_{d,LC}(t)\exp(\int_0^t r^*_{LC}(s)\,ds)$—domestic bank account, LC, EM
- $X(t)$—exchange rate, EUR for 1 LC; $\tilde{X}(t)$—exchange rate, LC for 1 EUR
- $R_{f,EUR}(t)$—bond recovery, EUR, EM; $R_{d,LC}(t)$—bond recovery, LC, EM

We use the asterisk to denote risk, the first letter (*d* or *f* ) to denote domestic or foreign debt, and finally the currency of denomination is shown as *EUR* or LC.<sup>3</sup>

# • **Currency denominations**

- $P^*_{d,EUR}(t,T) = X(t)\,P^*_{d,LC}(t,T)$—dom. bond, EUR
- $P^*_{f,LC}(t,T) = \tilde{X}(t)\,P^*_{f,EUR}(t,T)$—for. bond, LC
- $B^*_{d,EUR}(t) = X(t)\,B^*_{d,LC}(t)$—dom. bank account, EUR
- $B^*_{f,LC}(t) = \tilde{X}(t)\,B^*_{f,EUR}(t)$—for. bank account, LC

• **Intensities**

Foreign debt, EUR: intensity $h_{EUR}(t) = h(t)$; compensator $h_{EUR}(t)\,q_{e,EUR}(t) = h(t)\int_E q_{f,EUR}(\omega;t,x)\,F_t(dx)$.
Domestic debt, LC: intensity $h_{LC}(t) = h(t)$; compensator $h_{LC}(t)\,q_{e,LC}(t) = h(t)\int_E q_{d,LC}(\omega;t,x)\,F_t(dx)$.

The compensator (generalized intensity) characterizes default. By controlling the recovery in a suitable way, we can control the compensator and thus the default event. We now turn to the dynamics of the instruments under consideration.

# • **Forward rates**

$$df_{EUR}(t,T) = \alpha_{EUR}(t,T)\,dt + \sum_{i=1}^n \sigma_{EUR,i}(t,T)\,dW_i^P(t)$$

$$df^*_{EUR}(t,T) = \alpha^*_{EUR}(t,T)\,dt + \sum_{i=1}^n \sigma^*_{EUR,i}(t,T)\,dW_i^P(t) + \int_E \delta^*_{EUR}(x,t,T)\,\mu(dx,dt)$$

<sup>3</sup>It must be further noted that we use standard definitions for the risky forward rates as in Schönbucher [12]. Namely, $f^*_{EUR/LC}(t,T) = -\frac{\partial}{\partial T} \log P^*_{f,EUR/d,LC}(t,T)$ with terminal conditions $P^*_{f,EUR/d,LC}(T,T) = R_{f,EUR/d,LC}(T)$. The risky bank accounts economically just represent a unit of currency invested at the respective short rates and continuously rolled over, accounting for any default losses. However, since the forward rates, resp. the bonds, are our basic modeling objects, it would be more precise to consider the bank accounts as quantities derived from them, similarly to Björk et al. [4], without going deeper here into the modified technical details.

$$df^*_{LC}(t,T) = \alpha^*_{LC}(t,T)\,dt + \sum_{i=1}^n \sigma^*_{LC,i}(t,T)\,dW_i^P(t) + \int_E \delta^*_{LC}(x,t,T)\,\mu(dx,dt)$$

We assume that in case of default there is a market turmoil leading to a jump in both curves. For maturity $T$, the EUR curve jumps by $\int_E \delta^*_{EUR}(x,t,T)\,\mu(dx,dt)$ and the local currency curve by $\int_E \delta^*_{LC}(x,t,T)\,\mu(dx,dt)$. The terms $\delta^*_{EUR}(x,t,T)$ and $\delta^*_{LC}(x,t,T)$ give the jump sizes of the respective curves for every maturity. As indicated at the beginning of the section, we work everywhere under the market filtration $G_t$, so both the Brownian motions and the point process are adapted to it.

# • **Recoveries**

$$\frac{d\boldsymbol{R}\_{f,EUR}(t)}{\boldsymbol{R}\_{f,EUR}(t)} = -\int\_{E} \boldsymbol{q}\_{f,EUR}(\boldsymbol{x},t)\mu(d\boldsymbol{x},dt)$$
 
$$\frac{d\boldsymbol{R}\_{d,LC}(t)}{\boldsymbol{R}\_{d,LC}(t)} = -\int\_{E} \boldsymbol{q}\_{d,LC}(\boldsymbol{x},t)\mu(d\boldsymbol{x},dt)$$

After each default we have a devaluation of the respective bond by $\int_E q_{f/d}(x,t)\,\mu(dx,dt)$. The stochasticity of the loss is captured by the random jump size $q(\cdot,\cdot)$, as elaborated in Sect. 2.1.

# • **Bank accounts**

$$\begin{aligned} \frac{dB_{EUR}(t)}{B_{EUR}(t)} &= r_{EUR}(t)\,dt \\ \frac{dB^*_{f,EUR}(t)}{B^*_{f,EUR}(t)} &= r^*_{EUR}(t)\,dt - \int_E q_{f,EUR}(x,t)\,\mu(dx,dt) \\ \frac{dB^*_{d,LC}(t)}{B^*_{d,LC}(t)} &= r^*_{LC}(t)\,dt - \int_E q_{d,LC}(x,t)\,\mu(dx,dt) \end{aligned}$$

• **Exchange rate**

$$\frac{dX(t)}{X(t)} = \alpha_X(t)\,dt + \sum_{i=1}^n \sigma_{X,i}(t)\,dW_i^P(t) - \int_E \delta_X(x,t)\,\mu(dx,dt)$$

We assume that in case of default the market turmoil causes an exchange rate devaluation by $\int_E \delta_X(x,t)\,\mu(dx,dt)$.
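
The exchange rate dynamics can be discretized with a simple Euler scheme. The sketch below is a one-factor illustration with hypothetical constant parameters of our own choosing (drift $\alpha_X$, volatility $\sigma_X$, jump intensity $h$, and a deterministic devaluation size $\delta_X$ at each default):

```python
import math, random

def simulate_fx(x0, alpha, sigma, h, delta, T, n_steps=1000, seed=2):
    """Euler scheme for dX/X = alpha dt + sigma dW - delta dN."""
    rng = random.Random(seed)
    dt = T / n_steps
    x = x0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        jump = 1 if rng.random() < h * dt else 0   # Bernoulli approximation of dN
        x *= 1.0 + alpha * dt + sigma * dw - delta * jump
    return x

# sanity check: without noise and jumps, X grows deterministically
x_det = simulate_fx(x0=1.0, alpha=0.05, sigma=0.0, h=0.0, delta=0.3, T=1.0)
# x_det is close to exp(0.05) ~ 1.0513
```

Each simulated default multiplies the rate by $(1 - \delta_X)$, which is the devaluation channel feeding the currency spread discussed below.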

• **Bond prices**

$$\begin{aligned} P_{EUR}(t,T) &= \exp\Big(-\int_t^T f_{EUR}(t,s)\,ds\Big) = E^{Q^f}\Big(\exp\Big(-\int_t^T r_{EUR}(s)\,ds\Big) \,\Big|\, G_t\Big) \\ P^*_{f,EUR}(t,T) &= R_{f,EUR}(t)\exp\Big(-\int_t^T f^*_{EUR}(t,s)\,ds\Big) \\ &= E^{Q^f}\Big(\exp\Big(-\int_t^T r_{EUR}(s)\,ds\Big) R_{f,EUR}(T) \,\Big|\, G_t\Big) \\ P^*_{d,EUR}(t,T) &= P^*_{d,LC}(t,T)\,X(t) = R_{d,LC}(t)\,X(t)\exp\Big(-\int_t^T f^*_{LC}(t,s)\,ds\Big) \\ &= E^{Q^f}\Big(\exp\Big(-\int_t^T r_{EUR}(s)\,ds\Big) R_{d,LC}(T)\,X(T) \,\Big|\, G_t\Big) \end{aligned}$$

It must be emphasized that the effects of exchange rate, recovery, and the expected devaluation sizes are incorporated in the respective forward rates of the bonds. Furthermore, the expectations are taken under *Q <sup>f</sup>* , the foreign risk-neutral measure.

# • **Arbitrage**

Under standard regularity conditions, for the system to be free of arbitrage, all traded assets denominated in euro must have a rate of return *rEUR* under *Q <sup>f</sup>* . This means that the processes:

$$\frac{P\_{EUR}(t,T)}{B\_{EUR}(t)}, \frac{B^\*\_{f,EUR}(t)}{B\_{EUR}(t)}, \frac{P^\*\_{f,EUR}(t,T)}{B\_{EUR}(t)}, \frac{B^\*\_{d,LC}(t)X(t)}{B\_{EUR}(t)}, \frac{P^\*\_{d,LC}(t,T)X(t)}{B\_{EUR}(t)}$$

must be local martingales under $Q^f$. For our purposes it is enough that they are true martingales.

Taking the stochastic differentials of the above expressions, and deferring the technicalities to the appendix, we obtain the respective no-arbitrage conditions.

• Spreads:

$$r\_{EUR}^\*(t) - r\_{EUR}(t) = h(t)\varphi\_{q\_{f,EUR}}(t)\tag{5.1}$$

$$\begin{aligned} r^*_{LC}(t) - r^*_{EUR}(t) = &-\alpha_X(t) - \phi(t)\sigma_X(t) \\ &+ h(t)\big(\varphi_{\delta_X}(t) - \varphi_{q_{d,LC},\delta_X}(t) + \varphi_{q_{d,LC}}(t) - \varphi_{q_{f,EUR}}(t)\big) \end{aligned} \tag{5.2}$$

• Drifts:

$$\begin{aligned} \alpha_{EUR}(t,T) &= \sigma_{EUR}(t,T)\int_t^T \sigma_{EUR}(t,v)\,dv - \sigma_{EUR}(t,T)\phi(t) \\ \alpha^*_{EUR}(t,T) &= \sigma^*_{EUR}(t,T)\int_t^T \sigma^*_{EUR}(t,v)\,dv - \sigma^*_{EUR}(t,T)\phi(t) + h_{EUR}(t)\,\varphi^{\,q_{f,EUR},\,\delta_X}_{\theta^*_{EUR}}(t) \\ \alpha^*_{LC}(t,T) &= \sigma^*_{LC}(t,T)\int_t^T \sigma^*_{LC}(t,v)\,dv - \sigma^*_{LC}(t,T)\phi(t) - \sigma^*_{LC}(t,T)\sigma_X(t) \\ &\quad + h_{LC}(t)\,\varphi^{\,q_{d,LC},\,\delta_X}_{\theta^*_{LC}}(t), \end{aligned}$$

where we have used the notation:

$$\theta^*_{EUR} = \exp\Big(-\int_t^T \delta^*_{EUR}(x,t,s)\,ds\Big), \qquad \theta^*_{LC} = \exp\Big(-\int_t^T \delta^*_{LC}(x,t,s)\,ds\Big)$$

$$\varphi^{\,x,y,\ldots}_{a,b,\ldots}(t) = \int_E (ab\cdots)\,\big((1-x)(1-y)\cdots\big)\,\Phi(t,x)\,F_t(dx)$$

and used vector notation and scalar products where necessary for simplicity.

By $\Phi(t,x)$ and $\phi(t)$ we denote the Girsanov kernels of the counting process and the Brownian motion, respectively, for the change of probability measure from $P$ to $Q^f$. The terms $\varphi(t)$ represent the scaled expected jump sizes of the counting process. We can interpret $\phi(t)$ as the market price of diffusion risk and $\varphi(t)$ as the market price of jump risk. By parametrizing the volatilities and the market prices of risk, as well as imposing suitable dynamics for $h(t)$, we obtain a full characterization of our system. Furthermore, the intensity could be a function of the underlying processes of the rates, so that correlation between the intensity, the interest rates, and the exchange rate can be obtained.

*Spreads diagnostics from a reduced form point of view* It is important to give a deeper interpretation of the no-arbitrage conditions and to see which factors drive the credit and currency spreads. Despite the heavy notation, the analysis flows naturally. The drift equations give the modified HJM drift restrictions; the slight change from the classical riskless case is due to the jumps. Equation (5.1) shows that the credit spread is proportional to the intensity of default and the expected LGD, scaled by the coefficient controlling the risk aversion: the higher they are, the higher the spread. Equation (5.2) gives the currency spread, which arises for two main reasons. Firstly, the intensity of default and the difference between the two LGDs in local currency and euro, scaled by the risk-aversion coefficient, act as in the previous case; they also make the subordination explicit. Secondly, the expected local currency depreciation, its volatility, and the risk aversion to diffusion risk act similarly to the standard uncovered interest parity (UIP) relationship: the higher they are, the higher the spread. It is both important and interesting to note that inflation does not appear directly; as the next section shows, it influences the spreads only through a secondary channel.
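
The content of (5.1) and (5.2) can be made concrete in a stylized special case. The sketch below assumes deterministic markdowns and devaluation, so each $\varphi$-term collapses to the corresponding product with the Girsanov kernel $\Phi \equiv 1$; all parameter values are hypothetical. Note that the jump part of (5.2) factors as $h\,(1 - (1-q_d)(1-\delta_X) - q_f)$:

```python
def credit_spread(h, q_f):
    # Eq. (5.1): r*_EUR - r_EUR = h * phi_{q_f,EUR}
    return h * q_f

def currency_spread(h, q_f, q_d, delta_x, alpha_x, phi, sigma_x):
    # Eq. (5.2) with deterministic markdowns/devaluation (Phi == 1):
    # phi_{delta_X} = delta_x, phi_{q_d,delta_X} = q_d*delta_x, phi_{q_d} = q_d
    jump_part = h * (delta_x - q_d * delta_x + q_d - q_f)
    uip_part = -alpha_x - phi * sigma_x
    return uip_part + jump_part

h, q_f, q_d, delta_x = 0.10, 0.40, 0.70, 0.25   # subordination: q_d > q_f
alpha_x, phi, sigma_x = -0.02, -0.5, 0.12       # expected LC depreciation, risk prices

cs = credit_spread(h, q_f)                       # 0.04, i.e. a 4% credit spread
fx = currency_spread(h, q_f, q_d, delta_x, alpha_x, phi, sigma_x)
# subordination and devaluation push the currency spread above the credit spread
```

Raising $q_d$ relative to $q_f$, or the devaluation size $\delta_X$, widens the currency spread, which is exactly the subordination and UIP-like channels described above.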

*Monetization* The analysis so far considered a loss of $1 - R_{d,LC}(T)$ on default of the domestic debt. However, if a full monetization is applied, then $R_{d,LC}(T) = 1$ and thus $\varphi_{q_{d,LC}}(t) = 0$ and $\varphi_{q_{d,LC},\delta_X}(t) = 0$. Even if such a monetary injection is neutral to nominal values, it is certainly not neutral to real ones. Devaluation arises due to the negative market sentiment following the default and the higher amount of money in circulation. Its effect can be measured differently depending on what we take as a base: the price index or the exchange rate. Most naturally, we can expect both of them to depreciate due to the structural macro links between these variables. Quantifying the amount would require a macro model, which is beyond the scope of the reduced form model presented; the latter only shows what characteristics the market prices in general, without imposing concrete macro links among them. Depending on the base, we would have a direct estimation of certain indicators and an indirect one of the rest, up to their structural influence on the former. If inflation is taken as the base, we would compare inflation-indexed bonds to non-indexed ones; the spread between them would estimate the expected inflation. Unfortunately, such an analysis is unrealistic, because such bonds are issued very rarely by emerging market countries. If the exchange rate is taken as the base, we would compare domestic debt bonds to foreign debt bonds; the spread between them would estimate the currency risk and the devaluation effect. The estimate for inflation would then be indirect and based on hypothetical structural links.

Whether the country would monetize or declare a formal default is based on strategic considerations; which option it takes is a matter of structural analysis. By all means, its decision is priced. In case of default, the pricing formula is Eq. (5.2). In case of monetization, we would have a jump in the exchange rate. Let us denote its size by $\hat{\delta}_X$. It will differ from the no-monetization one, $\delta_X$, due to the different regimes that are followed, and we would thus get:

$$r^*_{LC}(t) - r^*_{EUR}(t) = h(t)\big(\varphi_{\hat{\delta}_X}(t) - \varphi_{q_{f,EUR}}(t)\big) - \alpha_X(t) - \phi(t)\sigma_X(t) \tag{6}$$

There is no a priori no-arbitrage argument that $\varphi_{\hat{\delta}_X}(t) = \varphi_{\delta_X}(t) - \varphi_{q_{d,LC},\delta_X}(t) + \varphi_{q_{d,LC}}(t)$ must hold so that the two scenarios are equivalent.<sup>4</sup> The only information we get from the market is an estimate of the generalized intensity, either $h(t)\,\varphi_{\hat{\delta}_X}(t)$ or $h(t)\big(\varphi_{\delta_X}(t) - \varphi_{q_{d,LC},\delta_X}(t) + \varphi_{q_{d,LC}}(t)\big)$, without knowing which scenario will be realized.

# **3 CDS-Bond Basis**

# *3.1 General Notes*

The setting we built gives us an alternative way of evaluating the CDS-Bond basis, represented in Fig. 1. There the LC zero-coupon yield curve is built from local currency treasuries with an appropriate smoothing method. The EUR zero-coupon yield curve is built from CDS quotes, with the mathematics presented in the sequel. Along with the curves, a few Eurobonds are represented as light blue dots. Both credit and currency spreads can be computed for them employing a standard Z-spread methodology. Despite its various shortcomings, as discussed in Berd et al. [2] and Elizalde et al. [10], it gives us a workable measure of the spreads and is widely accepted by practitioners. Subtracting the bond-implied spreads from the yield curves' implied credit and currency spreads, we get two alternative specifications of the CDS-Bond basis. Several things need comment.

Firstly, the two basis measures are not equal by construction. The one representing the credit spread is subject to a Z-spread measurement based on a parallel shift of the benchmark curve; it thus depends on the whole benchmark curve and has nothing to do with the LC one. Vice versa, the basis implied by the LC curve is subject to a Z-spread measurement based on a parallel shift of the LC curve; it depends on the whole LC curve but has nothing to do with the benchmark one. This provides intuition for how the introduction of the LC curve brings additional information into the picture and provides more market completeness that can be exploited in relative value trades.

Secondly, as mentioned above, the EUR curve is built by utilizing CDS quotes. As shown below, in the procedure employed, an assumption is needed for the recovery scheme. What it should be depends on our purposes. On one hand, if we would like to just extract the credit and currency spreads from the yield curves and calibrate a reduced form model,<sup>5</sup> it would be convenient to employ the setting from Sect. 2. So

<sup>4</sup>This is a delicate issue. As indicated, a further structural analysis is needed for a complete answer. The crucial point is that the two scenarios affect in a different way the monetary base. It will have a neutral effect on the macro variables in general and the risky spreads in particular only in case the economy is at the macro potential. Exactly when that is not the case, we can expect that the two scenarios will not be equivalent. A further elaboration on these issues from a structural point of view could be found in Yordanov [15, 16].

<sup>5</sup>We postpone the factors to build realization of the model from Sect. 2 so that it becomes operative for calibration and consequent further analysis to the forthcoming follow-up paper of Yordanov [17].

an RMV assumption for the EUR curve is the most appropriate one, since the same assumption is imposed on the LC curve, and when subtracting the corresponding zero yields we subtract apples from apples. On the other hand, if we want to extract the basis, we must be careful, since the Eurobonds are priced under a firmly established RP assumption. So, for a standard calculation via a Z-spread based on the benchmark curve, we need an RP-built EUR curve to be consistent. Given the many problems of the Z-spread, it would definitely be bad to add further ones coming from a recovery-assumption inconsistency, which would only contribute further to an imprecise basis measurement. For a calculation via a Z-spread based on the LC curve, we should not use the RMV LC curve but a modified one: from the RMV LC curve we need to build an RP one and then compute the Z-spread and the basis to be consistent.
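
The Z-spread underlying both basis specifications (footnote 1) is the parallel shift $z$ of a zero curve that reprices the bond. A minimal bisection-based sketch, with a hypothetical flat benchmark curve and bond of our own choosing:

```python
import math

def pv(cashflows, zero_curve, z):
    """Present value with continuously compounded zero yields shifted by z."""
    return sum(c * math.exp(-(zero_curve(t) + z) * t) for t, c in cashflows)

def z_spread(price, cashflows, zero_curve, lo=-0.5, hi=0.5, tol=1e-10):
    """Bisection: find z with PV(curve + z) == price (PV is decreasing in z)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pv(cashflows, zero_curve, mid) > price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# hypothetical 5y bond with 6% annual coupons over a flat 3% benchmark curve
cashflows = [(t, 6.0) for t in range(1, 5)] + [(5, 106.0)]
curve = lambda t: 0.03
price = pv(cashflows, curve, 0.02)      # construct a price with a known 200 bp shift
z = z_spread(price, cashflows, curve)   # recovers z ~ 0.02
```

Swapping `curve` between the benchmark and the LC zero curve yields exactly the two alternative basis measures discussed above.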

# *3.2 Technical Notes*

Here we provide the technical notes regarding the above discussion.

# • **EUR curve**

Using OIS differential discounting as in Doctor and Goulden [6], we can modify<sup>6</sup> the standard ISDA CDS bootstrap procedure and extract at time $t$ the $T$-maturity default probabilities $p^R_{EUR}(t,T)$ under a recovery assumption of $R$. Then we obtain in a straightforward way the EUR zero-coupon yields (ytm) and credit spreads (spr) under RMV and RP:

– **RMV**:

$$\begin{aligned} \mathrm{spr}^{RMV,R}_{EUR}(t,T) &= -\frac{(1-R)\log\big(1 - p^R_{EUR}(t,T)\big)}{T-t} \\ \mathrm{ytm}^{RMV,R}_{EUR}(t,T) &= \mathrm{spr}^{RMV,R}_{EUR}(t,T) + y_{EUR}(t,T) \end{aligned}$$

– **RP**:

$$\begin{aligned} \mathrm{spr}^{RP,R}_{EUR}(t,T) &= -\frac{\log\big(R\, p^R_{EUR}(t,T) + 1 - p^R_{EUR}(t,T)\big)}{T-t} \\ \mathrm{ytm}^{RP,R}_{EUR}(t,T) &= \mathrm{spr}^{RP,R}_{EUR}(t,T) + y_{EUR}(t,T), \end{aligned}$$

where *yEUR*(*t*, *T* ) is the *T*−maturity zero yield of the riskless benchmark curve (e.g. German bunds).
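
Given the bootstrapped default probabilities $p^R_{EUR}(t,T)$, the two formula pairs above are direct to implement. A sketch with hypothetical inputs (a single maturity and a flat benchmark zero yield):

```python
import math

def spr_rmv(p, R, tau):
    # RMV credit spread from the tau-maturity default probability p
    return -(1.0 - R) * math.log(1.0 - p) / tau

def spr_rp(p, R, tau):
    # RP credit spread from the same default probability
    return -math.log(R * p + 1.0 - p) / tau

def ytm(spread, y_benchmark):
    # risky zero yield = credit spread + riskless benchmark zero yield
    return spread + y_benchmark

p, R, tau, y_eur = 0.12, 0.40, 5.0, 0.02   # hypothetical inputs
s_rmv = spr_rmv(p, R, tau)
s_rp = spr_rp(p, R, tau)
# by concavity of the logarithm, s_rp <= s_rmv for the same p and R
```

This ordering is one reason the recovery convention matters for the basis: the same default probability produces different spreads under the two schemes.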

# • **LC curve**

– **RMV**:

$\mathrm{ytm}^{RMV,R}_{LC}(t,T)$—observed from the market

<sup>6</sup>The OIS discounting deserves special comment, since there is still no consensus on how to bootstrap OIS swaps to form the discount factors for the CDS bootstrap. The problem comes from the presence of gaps for certain maturities. A possible specification is given in West [14].

$$\begin{aligned} \mathrm{spr}^{RMV,R}_{LC,EUR}(t,T) &= \mathrm{ytm}^{RMV,R}_{LC}(t,T) - \mathrm{ytm}^{RMV,R}_{EUR}(t,T) \\ p^R_{LC}(t,T) &= 1 - \exp\Big(-\frac{\mathrm{spr}^{RMV,R}_{LC,EUR}(t,T)}{1-R}(T-t)\Big) \end{aligned}$$

– **RP**:

$$\begin{aligned} \mathrm{spr}^{RP,R}_{LC,EUR}(t,T) &= -\frac{\log\big(R\, p^R_{LC}(t,T) + 1 - p^R_{LC}(t,T)\big)}{T-t} \\ \mathrm{ytm}^{RP,R}_{LC}(t,T) &= \mathrm{spr}^{RP,R}_{LC,EUR}(t,T) + \mathrm{ytm}^{RP,R}_{EUR}(t,T) \end{aligned}$$

Note that, similarly to the EUR curve procedure, the LC curve one relies on the premise that both the RMV and RP cases share the same $p^R_{LC}(t,T)$, which stands for the probability of default on the LC debt. However, according to the analysis of the no-arbitrage conditions in Sect. 2, due to the monetization such a probability does not formally exist. Here it is only a derived quantity: although we assume the same point process as the driver of default on both the LC and the EUR debt, we can control the compensator by changing the recoveries. However, we can simply take the formulas above for the RP spread as definitions. In the limit case of zero EUR debt, they would be entirely consistent with the RP treatment of the EUR debt, which provides a justification for our method.
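
The LC-curve transformation from RMV to RP can be coded symmetrically. A sketch with hypothetical observed zero yields and a 40% recovery assumption; note that $p^R_{LC}$ is recovered by inverting the RMV spread formula before the RP spread is formed:

```python
import math

def lc_rp_yield(ytm_lc_rmv, ytm_eur_rmv, ytm_eur_rp, R, tau):
    """Transform the observed RMV LC zero yield into an RP-consistent one."""
    spr_rmv = ytm_lc_rmv - ytm_eur_rmv                 # RMV spread over the EUR curve
    p_lc = 1.0 - math.exp(-spr_rmv / (1.0 - R) * tau)  # implied LC default probability
    spr_rp = -math.log(R * p_lc + 1.0 - p_lc) / tau    # re-express the spread under RP
    return spr_rp + ytm_eur_rp, p_lc

# hypothetical 5y zero yields (tau = T - t = 5)
ytm_rp, p_lc = lc_rp_yield(ytm_lc_rmv=0.065, ytm_eur_rmv=0.035,
                           ytm_eur_rp=0.034, R=0.40, tau=5.0)
# p_lc ~ 1 - exp(-0.03/0.6*5) = 1 - exp(-0.25) ~ 0.221
```

The resulting RP-built LC curve is the consistent input for the Z-spread basis calculation discussed in Sect. 3.1.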

# *3.3 CDS-Bond Basis Empirics*

For illustration we visualize the Z-spread measured basis, computed in the two alternative ways, for a set of European EM countries. They are chosen so that they have both Eurobonds outstanding in EUR and a liquid LC curve. The data sources are Bloomberg, Datastream, and CBonds. We build the LC curves from the Bloomberg BFV curves; since these are par curves, see Lee [11], we transform them into zero-coupon yield curves. For spread extraction we use both EUR- and USD-denominated CDS. We give preference to the former, but where quotes are missing we use USD quotes instead, applying a quanto adjustment based on cross-currency basis swaps. The countries in focus are: Bulgaria (BGN), Czech Rep. (CZK), Hungary (HUF), Lithuania (LTL), Poland (PLN), Romania (RON), and Slovakia (SKK).

Since there are plenty of bonds outstanding, aggregate measures are presented based on duration weighting. The following events are marked by the vertical dashed lines: 1: the GM turmoil of May 9, 2005; 2: the liquidity crisis of August 9, 2007; 3: the Bear Stearns collapse of March 14, 2008; 4: the Lehman default of September 15, 2008; 5: the Greek turmoil of April 23, 2010; 6: the US rating downgrade of August 5, 2011; 7: the ECB refi-rate woes of May 6, 2012.

**Fig. 2** CDS-Bond basis across countries

The short conclusion from the patterns in Fig. 2 is that bonds provide important input for extracting the credit and currency spreads. The two alternative basis formulations are broadly similar in shape, but they still give different results, and the difference should not be underestimated. This is not surprising, since the outcome is driven by the difference in shapes between the benchmark and the LC curves. Market strategists and arbitrage traders thus have ample scope for interpretation and trade design.

# **4 Conclusion**

The paper considers the credit and currency spreads of a risky EM country. The necessary no-arbitrage conditions are derived and their informational content is analyzed. The setting is applied to the proper construction of the foreign and local currency yield curves of a sovereign, as well as to relative value diagnostics in a multi-currency framework. In that direction, an alternative measure of the CDS-Bond basis is discussed in which the local currency curve is employed as a pillar. The aim of the paper is both to point out the rich opportunities the setting offers for market-related research of use to strategists and policy officers, and to take the first steps toward investigating these opportunities.

**Acknowledgements** The KPMG Center of Excellence in Risk Management is acknowledged for organizing the conference "Challenges in Derivatives Markets - Fixed Income Modeling, Valuation Adjustments, Risk Management, and Regulation".

# **Appendix**

Here we briefly elaborate on the derivation of Eqs. (5.1) and (5.2). Applying Girsanov's theorem and Itô's lemma for jump diffusions to Eq. (2), we get the dynamics:

$$\begin{split} \frac{dP\_{f,EUR}^{\ast}(t,T)}{P\_{f,EUR}^{\ast}(t,T)} &= \left( -\int\_{t}^{T} \alpha\_{EUR}^{\ast}(t,s)ds + r\_{EUR}^{\ast}(t) + \frac{1}{2}||\int\_{t}^{T} \sigma\_{EUR}^{\ast}(t,s)ds||^{2} \right) dt \\ &- \left( \int\_{t}^{T} \sigma\_{EUR}^{\ast}(t,s)ds \right) dW^{P}(t) \\ &+ \int\_{E} (1 - q\_{f,EUR}(x,t)) \left( \exp\left( -\int\_{t}^{T} \delta\_{EUR}^{\ast}(x,t,s)ds \right) - 1 \right) \mu(dx,dt) \\ &- \int\_{E} q\_{f,EUR}(x,t)\mu(dx,dt) \end{split}$$

$$\begin{split} \frac{dP\_{d,LC}^{\ast}(t,T)}{P\_{d,LC}^{\ast}(t,T)} &= \left( -\int\_{t}^{T} \alpha\_{LC}^{\ast}(t,s)ds + r\_{LC}^{\ast}(t) + \frac{1}{2}||\int\_{t}^{T} \sigma\_{LC}^{\ast}(t,s)ds||^{2} \right) dt \\ &- \left( \int\_{t}^{T} \sigma\_{LC}^{\ast}(t,s)ds \right) dW^{P}(t) \\ &+ \int\_{E} (1 - q\_{d,LC}(x,t)) \left( \exp\left( -\int\_{t}^{T} \delta\_{LC}^{\ast}(x,t,s)ds \right) - 1 \right) \mu(dx,dt) \\ &- \int\_{E} q\_{d,LC}(x,t)\mu(dx,dt) \end{split}$$

Furthermore, we have the dynamics of the exchange rate:

$$\frac{dX(t)}{X(t)} = \alpha\_X(t)dt + \sum\_{i=1}^n \sigma\_{X,i}(t)dW\_i^P(t) - \int\_E \delta\_X(x, t)\mu(dx, dt).$$

Using the no-arbitrage conditions and equating the expected local drifts to the risk-free rate, we obtain the stated results.

**Open Access** This chapter is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, a link is provided to the Creative Commons license and any changes made are indicated.

The images or other third party material in this chapter are included in the work's Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work's Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.

# **References**


# **Part III Financial Engineering**

# **Basket Option Pricing and Implied Correlation in a One-Factor Lévy Model**

**Daniël Linders and Wim Schoutens**

**Abstract** In this paper we employ a one-factor Lévy model to determine basket option prices. More precisely, basket option prices are determined by replacing the distribution of the real basket with an appropriate approximation. For the approximate basket we determine the underlying characteristic function and hence we can derive the related basket option prices by using the Carr–Madan formula. We consider a three-moments-matching method. Numerical examples illustrate the accuracy of our approximations; several Lévy models are calibrated to market data and basket option prices are determined. In the last part we show how our newly designed basket option pricing formula can be used to define implied Lévy correlation by matching model and market prices for basket options. Our main finding is that the implied Lévy correlation smile is flatter than its Gaussian counterpart. Furthermore, if (near) at-the-money option prices are used, the corresponding implied Gaussian correlation estimate is a good proxy for the implied Lévy correlation.

**Keywords** Basket option · Implied correlation · One-factor Lévy model · Variance-Gamma

# **1 Introduction**

Nowadays, an increased volume of multi-asset derivatives is traded. An example of such a derivative is a *basket option*. The basic version of such a multivariate product has the same characteristics as a vanilla option, but now the underlying is a basket of stocks instead of a single stock. The pricing of these derivatives is not a trivial task because it requires a model that jointly describes the stock prices involved.

D. Linders (B)

Faculty of Business and Economics, KU Leuven, Naamsestraat 69, 3000 Leuven, Belgium e-mail: daniel.linders@kuleuven.be

W. Schoutens Faculty of Science, KU Leuven, Celestijnenlaan 200, 3001 Heverlee, Belgium e-mail: Wim@Schoutens.be

Stock price models based on the lognormal model proposed in Black and Scholes [6] are popular choices from a computational point of view; however, they are not capable of capturing the skewness and kurtosis observed for log returns of stocks and indices. The class of Lévy processes provides a much better fit for the observed log returns and, consequently, the pricing of options and other derivatives in a Lévy setting is much more reliable. In this paper we consider the problem of pricing multi-asset derivatives in a multivariate Lévy model.

The most straightforward extension of the univariate Black and Scholes model is based on the *Gaussian copula model*, also called the multivariate Black and Scholes model. In this framework, the stocks composing the basket at a given point in time are assumed to be lognormally distributed and a Gaussian copula connects these marginals. Even in this simple setting, the price of a basket option is not given in a closed form and has to be approximated; see e.g. Hull and White [23], Brooks et al. [8], Milevsky and Posner [39], Rubinstein [42], Deelstra et al. [18], Carmona and Durrleman [12] and Linders [29], among others. However, the normality assumption for the marginals used in this pricing framework is too restrictive. Indeed, in Linders and Schoutens [30] it is shown that calibrating the Gaussian copula model to market data can lead to non-meaningful parameter values. This malfunctioning of the Gaussian copula model is typically observed in distressed periods. In this paper we extend the classical Gaussian pricing framework in order to overcome this problem.

Several extensions of the Gaussian copula model are proposed in the literature. For example, Luciano and Schoutens [32] introduce a multivariate Variance Gamma model where dependence is modeled through a common jump component. This model was generalized in Semeraro [44], Luciano and Semeraro [33], and Guillaume [21]. A stochastic correlation model was considered in Fonseca et al. [19]. A framework for modeling dependence in finance using copulas was described in Cherubini et al. [14]. The pricing of basket options in these advanced multivariate stock price models is not a straightforward task. There are several attempts to derive closed form approximations for the price of a basket option in a non-Gaussian world. In Linders and Stassen [31], approximate basket option prices in a multivariate Variance Gamma model are derived, whereas Xu and Zheng [48, 49] consider a local volatility jump diffusion model. McWilliams [38] derives approximations for the basket option price in a stochastic delay model. Upper and lower bounds for basket option prices in a general class of stock price models with known joint characteristic function of the log returns are derived in Caldana et al. [10].

In this paper we start from the one-factor Lévy model introduced in Albrecher et al. [1] to build a multivariate stock price model with correlated Lévy marginals. Stock prices are assumed to be driven by an idiosyncratic and a systematic factor. The idea of using a common market factor is not new in the literature and goes back to Vasicek [47]. Conditional on the common (or market) factor, the stock prices are independent. We show that our model generalizes the Gaussian model (with single correlation). Indeed, the idiosyncratic and systematic components are constructed from a Lévy process. Employing a Brownian motion in that construction delivers the Gaussian copula model, but other Lévy models arise by employing different Lévy processes like VG, NIG, Meixner, etc. As a result, this new one-factor Lévy model is more flexible and can capture other types of dependence.

The correlation is by construction always positive and, moreover, we assume a single correlation. Stocks can, in reality, be negatively correlated and correlations between different stocks will differ. From a tractability point of view, however, reporting a single correlation number is often preferred over *n*(*n* − 1)/2 pairwise correlations. The single correlation can be interpreted as a mean level of correlation and provides information about the general dependence among the stocks composing the basket. Such a single correlation appears, for example, in the construction of a correlation swap. Therefore, our framework may have applications in the pricing of such correlation products. Furthermore, calibrating a full correlation matrix may require an unrealistically large amount of data if the index consists of many stocks.

In the first part of this paper, we consider the problem of finding accurate approximations for the price of a basket option in the one-factor Lévy model. In order to value a basket option, the distribution of this basket has to be determined. However, the basket is a weighted sum of dependent stock prices and its distribution function is in general unknown or too complex to work with. Our valuation formula for the basket option is based on a moment-matching approximation. To be more precise, the (unknown) basket distribution is replaced by a shifted random variable having the same first three moments as the original basket. This idea was first proposed in Brigo et al. [7], where the Gaussian copula model was considered. Numerical examples illustrating the accuracy and the sensitivity of the approximation are provided.

In the second part of the paper we show how the well-established notions of implied volatility and implied correlation can be generalized in our multivariate Lévy model. We assume that a finite number of options, written on the basket and the components, are traded. The prices of these derivatives are observable and will be used to calibrate the parameters of our stock price model. An advantage of our modeling framework is that each stock is described by a volatility parameter and that the marginal parameters can be calibrated separately from the correlation parameter. We give numerical examples to show how to use the vanilla option curves to determine an implied Lévy volatility for each stock based on a Normal, VG, NIG, and Meixner process and determine basket option prices for different choices of the correlation parameter.

An *implied Lévy correlation* estimate arises when we tune the single correlation parameter such that the model price exactly hits the market price of a basket option for a given strike. We determine implied correlation levels for the stocks composing the Dow Jones Industrial Average in a Gaussian and a Variance Gamma setting. We observe that implied correlation depends on the strike and in the VG model, this implied Lévy correlation *smile* is flatter than in the Gaussian copula model. The standard technique to price non-traded basket options (or other multi-asset derivatives) is by interpolating on the implied correlation curve. It is shown in Linders and Schoutens [30] that in the Gaussian copula model, this technique can sometimes lead to non-meaningful correlation values. We show that the Lévy version of the implied correlation solves this problem (at least to some extent). Several papers consider the problem of measuring implied correlation between stock prices; see e.g. Fonseca et al. [19], Tavin [46], Ballotta et al. [4], and Austing [2]. Our approach is different in that we determine implied correlation estimates in the one-factor Lévy model using multi-asset derivatives consisting of many assets (30 assets for the Dow Jones). When considering multi-asset derivatives with a low dimension, determining the model prices of these multi-asset derivatives becomes much more tractable. A related paper is Linders and Stassen [31], where the authors also use high-dimensional multi-asset derivative prices for calibrating a multivariate stock price model. However, whereas the current paper models the stock returns using correlated Lévy distributions, the cited paper uses time-changed Brownian motions with a common time change.

# **2 The One-Factor Lévy Model**

We consider a market where *n* stocks are traded. The price level of stock *j* at some future time *t*, 0 ≤ *t* ≤ *T*, is denoted by *Sj*(*t*). Dividends are assumed to be paid continuously and the dividend yield of stock *j* is constant and deterministic over time. We denote this dividend yield by *qj*. The current time is *t* = 0. We fix a future time *T* and we always consider the random variables *Sj*(*T*) denoting the time-*T* prices of the different stocks involved. The price level of a basket of stocks at time *T* is denoted by *S*(*T*) and given by

$$S(T) = \sum\_{j=1}^{n} w\_j S\_j(T),$$

where *wj* > 0 are weights which are fixed upfront. In case the basket represents the price of the Dow Jones, the weights are all equal. If this single weight is denoted by *w*, then 1/*w* is referred to as the Dow Jones Divisor.<sup>1</sup> The pay-off of a basket option with strike *K* and maturity *T* is given by $(S(T) - K)\_+$, where $(x)\_+ = \max(x, 0)$. The price of this basket option is denoted by *C*[*K*, *T*]. We assume that the market is arbitrage-free and that there exists a risk-neutral pricing measure Q such that the basket option price *C*[*K*, *T*] can be expressed as the discounted risk-neutral expected value. In this pricing formula, discounting is performed using the risk-free interest rate *r*, which is, for simplicity, assumed to be deterministic and constant over time. Throughout the paper, we always assume that all expectations we encounter are well-defined and finite.
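As a quick sanity check of this risk-neutral pricing formula, the Monte Carlo sketch below prices a basket call in the simplest Gaussian special case, with a single correlation ρ and equal marginal volatilities (all parameter values are hypothetical and chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs: basket of n = 3 stocks with equal weights and flat parameters
n, T, r, K = 3, 1.0, 0.02, 100.0
S0, sigma, rho, w = 100.0, 0.2, 0.5, 1.0 / 3

m = 200_000
Z = rng.standard_normal((m, 1))                 # systematic (market) factor
E = rng.standard_normal((m, n))                 # idiosyncratic factors
A = np.sqrt(rho) * Z + np.sqrt(1 - rho) * E     # correlated standard normals

# Risk-neutral terminal prices and the discounted expected pay-off
ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * A)
basket = (w * ST).sum(axis=1)
price = np.exp(-r * T) * np.maximum(basket - K, 0.0).mean()
```

The martingale property provides a built-in check: the discounted sample mean of the basket should be close to its time-0 value.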

<sup>1</sup>More information and the current value of the Dow Jones Divisor can be found here: http://www.djindexes.com.

# *2.1 The Model*

The most straightforward way to model dependent stock prices is to use a Black and Scholes model for the marginals and connect them with a Gaussian copula. A crucial (and simplifying) assumption in this approach is the normality assumption. It is well-known that log returns do not pass the test for normality. Indeed, log returns exhibit a skewed and leptokurtic distribution which cannot be captured by a normal distribution; see e.g. Schoutens [43].

We generalize the Gaussian copula approach by allowing the risk factors to be distributed according to any infinitely divisible distribution with known characteristic function. This larger class of distributions increases the flexibility to find a more realistic distribution for the log returns. In Albrecher et al. [1] a similar framework was considered for pricing CDO tranches; see also Baxter [5]. The Variance Gamma case was considered in Moosbrucker [40, 41], whereas Guillaume et al. [22] consider the pricing of CDO-squared tranches in this one-factor Lévy model. A unified approach for these CIID models (conditionally independent and identically distributed) is given in Mai et al. [36].

Consider an infinitely divisible distribution for which the characteristic function is denoted by φ. A stochastic process *X* can be built using this distribution. Such a process is called a Lévy process with mother distribution having the characteristic function φ. The Lévy process *X* = {*X*(*t*)|*t* ≥ 0} based on this infinitely divisible distribution starts at zero and has independent and stationary increments. Furthermore, for *s*, *t* ≥ 0 the characteristic function of the increment *X*(*t* + *s*) − *X*(*t*) is φ*<sup>s</sup>* .

Assume that the random variable *L* has an infinitely divisible distribution and denote its characteristic function by φ*L*. Consider the Lévy process *X* = {*X*(*t*) | *t* ∈ [0, 1]} based on the distribution *L*. We assume that the process is standardized, i.e. E[*X*(1)] = 0 and Var[*X*(1)] = 1. One can then show that Var[*X*(*t*)] = *t*, for *t* ≥ 0. Define also a series of independent and standardized processes *Xj* = {*Xj*(*t*) | *t* ∈ [0, 1]}, for *j* = 1, 2,..., *n*. The process *Xj* is based on an infinitely divisible distribution *Lj* with characteristic function φ*Lj*. Furthermore, the processes *X*1, *X*2,..., *Xn* are independent from *X*. Take ρ ∈ [0, 1]. The r.v. *Aj* is defined by

$$A\_j = X(\rho) + X\_j(1 - \rho), \quad j = 1, 2, \dots, n. \tag{1}$$

In this construction, *X*(ρ) and *Xj*(1 − ρ) are random variables having the characteristic functions $\phi\_L^{\rho}$ and $\phi\_{L\_j}^{1-\rho}$, respectively. Denote the characteristic function of *Aj* by φ*Aj*. Because the processes *X* and *Xj* are independent and standardized, we immediately find that

$$\mathbb{E}[A\_j] = 0, \quad \text{Var}[A\_j] = 1 \quad \text{and} \quad \phi\_{A\_j}(t) = \phi\_L^\rho(t)\phi\_{L\_j}^{1-\rho}(t), \quad \text{for } j = 1, 2, \dots, n. \tag{2}$$

Note that if *X* and *Xj* are both Lévy processes based on the same mother distribution *L*, we obtain the equality $A\_j \stackrel{\mathrm{d}}{=} L$.

The parameter ρ describes the correlation between *Ai* and *Aj* for *i* ≠ *j*. Indeed, it was proven in Albrecher et al. [1] that in case *Aj*, *j* = 1, 2,..., *n*, is defined by (1), we have that

$$\text{Corr}\left[A\_i, A\_j\right] = \rho. \tag{3}$$
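For a Brownian mother distribution, *X*(ρ) is N(0, ρ) and *Xj*(1 − ρ) is N(0, 1 − ρ), so property (3) is easy to verify by simulation; the sketch below does so (ρ and the sample size are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
rho, m = 0.3, 500_000

# Brownian mother distribution: X(rho) ~ N(0, rho), X_j(1 - rho) ~ N(0, 1 - rho)
X = np.sqrt(rho) * rng.standard_normal(m)           # common factor X(rho)
A1 = X + np.sqrt(1 - rho) * rng.standard_normal(m)  # A_1 = X(rho) + X_1(1 - rho)
A2 = X + np.sqrt(1 - rho) * rng.standard_normal(m)  # A_2 = X(rho) + X_2(1 - rho)

corr = np.corrcoef(A1, A2)[0, 1]   # should be close to rho
```

The same construction works for any standardized Lévy process; only the sampling of the increments changes.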

We model the stock price levels *Sj*(*T*) at time *T* for *j* = 1, 2,..., *n* as follows

$$S\_j(T) = S\_j(0) \mathbf{e}^{\mu\_j T + \sigma\_j \sqrt{T} A\_j}, \quad j = 1, 2, \dots, n,\tag{4}$$

where $\mu\_j \in \mathbb{R}$ and $\sigma\_j > 0$. Note that in this setting, each time-*T* stock price is modeled as the exponential of a Lévy process. Furthermore, a drift $\mu\_j$ and a volatility parameter $\sigma\_j$ are added to match the characteristics of stock *j*. Our model, which we will call the one-factor Lévy model, can be considered as a generalization of the Gaussian model. Indeed, instead of a normal distribution, we allow for a Lévy distribution, while the Gaussian copula is generalized to a Lévy-based copula.<sup>2</sup> This model can also, at least to some extent, be considered as a generalization to the multidimensional case of the model proposed in Corcuera et al. [17], and the parameter $\sigma\_j$ in (4) can then be interpreted as the Lévy space (implied) volatility of stock *j*. The idea of building a multivariate asset model by taking a linear combination of a systematic and an idiosyncratic process can also be found in Kawai [26] and Ballotta and Bonfiglioli [3].

# *2.2 The Risk-Neutral Stock Price Processes*

If we take

$$
\mu\_j = (r - q\_j) - \frac{1}{T} \log \phi\_L \left( -\text{i}\sigma\_j \sqrt{T} \right), \tag{5}
$$

we find that

$$\mathbb{E}[S\_j(T)] = \mathbf{e}^{(r-q\_j)T}S\_j(0), \quad j = 1, 2, \dots, n.$$

From expression (5) we conclude that the risk-neutral dynamics of the stocks in the one-factor Lévy model are given by

$$S\_j(T) = S\_j(0)\mathbf{e}^{(r-q\_j-\omega\_j)T+\sigma\_j\sqrt{T}A\_j}, \quad j=1,2,\ldots,n,\tag{6}$$

where $\omega\_j = \frac{1}{T}\log\phi\_L\left(-\mathrm{i}\sigma\_j\sqrt{T}\right)$. We always assume $\omega\_j$ to be finite. The first three moments of *Sj*(*T*) can be expressed in terms of the characteristic function φ*Aj*. By

<sup>2</sup>The Lévy-based copula refers to the copula between the r.v.'s *A*1, *A*2,..., *An* and is different from the Lévy copula introduced in Kallsen and Tankov [25].

the martingale property, we have that $\mathbb{E}\left[S\_j(T)\right] = S\_j(0)\mathbf{e}^{(r-q\_j)T}$. The risk-neutral variance $\text{Var}\left[S\_j(T)\right]$ can be written as follows

$$\text{Var}\left[S\_j(T)\right] = S\_j(0)^2 \mathbf{e}^{2(r-q\_j)T} \left(\mathbf{e}^{-2\omega\_j T} \phi\_{A\_j}\left(-\mathrm{i}2\sigma\_j\sqrt{T}\right) - 1\right).$$

The second and third moment of *Sj*(*T*) are given by:

$$\begin{split} \mathbb{E}\left[S\_{j}(T)^{2}\right] &= \mathbb{E}[S\_{j}(T)]^{2} \frac{\phi\_{A\_{j}}\left(-\mathrm{i}2\sigma\_{j}\sqrt{T}\right)}{\phi\_{A\_{j}}\left(-\mathrm{i}\sigma\_{j}\sqrt{T}\right)^{2}}, \\ \mathbb{E}\left[S\_{j}(T)^{3}\right] &= \mathbb{E}[S\_{j}(T)]^{3} \frac{\phi\_{A\_{j}}\left(-\mathrm{i}3\sigma\_{j}\sqrt{T}\right)}{\phi\_{A\_{j}}\left(-\mathrm{i}\sigma\_{j}\sqrt{T}\right)^{3}}. \end{split}$$

We always assume that these quantities are finite. If the process *Xj* has mother distribution *L*, we can replace φ*Aj* by φ*L* in expression (5) and in the formulas for $\mathbb{E}\left[S\_j(T)^2\right]$ and $\mathbb{E}\left[S\_j(T)^3\right]$. From here on, we always assume that all Lévy processes are built on the same mother distribution. However, all results continue to hold in the more general case.
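As a sanity check, for the standard normal mother distribution, $\phi\_L(u) = \exp(-u^2/2)$, these moment formulas reduce to the familiar lognormal moments $\mathbb{E}[S^k] = \mathbb{E}[S]^k \exp(k(k-1)\sigma^2 T/2)$. A small sketch (all parameter values are hypothetical):

```python
import math, cmath

def phi_gauss(u):
    """Characteristic function of a standard normal mother distribution."""
    return cmath.exp(-0.5 * u * u)

# Hypothetical stock parameters
S0, r, q, sigma, T = 100.0, 0.02, 0.0, 0.25, 2.0
sqT = math.sqrt(T)

m1 = S0 * math.exp((r - q) * T)  # martingale property: E[S_j(T)]
m2 = m1**2 * (phi_gauss(-2j * sigma * sqT) / phi_gauss(-1j * sigma * sqT)**2).real
m3 = m1**3 * (phi_gauss(-3j * sigma * sqT) / phi_gauss(-1j * sigma * sqT)**3).real
# For the Gaussian mother: m2 = m1^2 e^{sigma^2 T}, m3 = m1^3 e^{3 sigma^2 T}
```

For a VG, NIG, or Meixner mother distribution one would only swap out `phi_gauss`; the moment ratios keep the same form.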

# **3 A Three-Moments-Matching Approximation**

In order to price a basket option, one has to know the distribution of the random sum *S*(*T*), which is a weighted sum of dependent random variables. This distribution is in most situations unknown or too cumbersome to work with. Therefore, we search for a new random variable which is sufficiently 'close' to the original random variable, but which is more attractive to work with. More concretely, we introduce in this section a new approach for approximating *C*[*K*, *T*] by replacing the sum *S*(*T*) with an appropriate random variable $\tilde{S}(T)$ which has a simpler structure, but for which the first three moments coincide with the first three moments of the original basket *S*(*T*). This moment-matching approach was also considered in Brigo et al. [7] for the multivariate Black and Scholes model.

Consider the Lévy process *Y* = {*Y*(*t*) | 0 ≤ *t* ≤ 1} with infinitely divisible distribution *L*. Furthermore, we define the random variable *A* as

$$A = Y(1).$$

In this case, the characteristic function of *A* is given by φ*L*. The sum *S*(*T*) is a weighted sum of dependent random variables and its cdf is unknown. We approximate the sum *S*(*T*) by $\tilde{S}(T)$, defined by

$$
\tilde{S}(T) = \bar{S}(T) + \lambda,\tag{7}
$$

where λ ∈ R and

$$\bar{S}(T) = S(0) \exp\left\{ (\bar{\mu} - \bar{\omega})T + \bar{\sigma}\sqrt{T}A \right\}. \tag{8}$$

The parameter $\bar{\mu} \in \mathbb{R}$ determines the drift and $\bar{\sigma} > 0$ is the volatility parameter. These parameters, as well as the shifting parameter λ, are determined such that the first three moments of $\tilde{S}(T)$ coincide with the corresponding moments of the real basket *S*(*T*). The parameter $\bar{\omega}$, defined as follows

$$
\bar{\boldsymbol{\omega}} = \frac{1}{T} \log \phi\_L \left( -\mathbf{i} \bar{\boldsymbol{\sigma}} \sqrt{T} \right),
$$

is assumed to be finite.

# *3.1 Matching the First Three Moments*

The first three moments of the basket *S*(*T*) are denoted by *m*1, *m*2, and *m*<sup>3</sup> respectively. In the following lemma, we express the moments *m*1, *m*2, and *m*<sup>3</sup> in terms of the characteristic function φ*<sup>L</sup>* and the marginal parameters. A proof of this lemma is provided in the appendix.

**Lemma 1** *Consider the one-factor Lévy model* (6) *with infinitely divisible mother distribution L. The first two moments m*<sup>1</sup> *and m*<sup>2</sup> *of the basket S*(*T*) *can be expressed as follows*

$$m\_1 = \sum\_{j=1}^{n} w\_j \mathbb{E}\left[S\_j(T)\right],\tag{9}$$

$$m\_2 = \sum\_{j=1}^{n} \sum\_{k=1}^{n} w\_j w\_k \mathbb{E}\left[S\_j(T)\right] \mathbb{E}\left[S\_k(T)\right] \left(\frac{\phi\_L\left(-\mathrm{i}(\sigma\_j + \sigma\_k)\sqrt{T}\right)}{\phi\_L\left(-\mathrm{i}\sigma\_j\sqrt{T}\right)\phi\_L\left(-\mathrm{i}\sigma\_k\sqrt{T}\right)}\right)^{\rho\_{j,k}} \tag{10}$$

*where*

$$\rho\_{j,k} = \begin{cases} \rho, \text{ if } j \neq k; \\ 1, \text{ if } j = k. \end{cases}$$

*The third moment m*<sup>3</sup> *of the basket S*(*T*) *is given by*


$$m\_3 = \sum\_{j=1}^{n} \sum\_{k=1}^{n} \sum\_{l=1}^{n} w\_j w\_k w\_l \mathbb{E}\left[S\_j(T)\right] \mathbb{E}\left[S\_k(T)\right] \mathbb{E}\left[S\_l(T)\right]$$

$$\times \frac{\phi\_L\left(-i\left(\sigma\_j + \sigma\_k + \sigma\_l\right)\sqrt{T}\right)^\rho}{\phi\_L\left(-i\sigma\_j\sqrt{T}\right)\phi\_L\left(-i\sigma\_k\sqrt{T}\right)\phi\_L\left(-i\sigma\_l\sqrt{T}\right)} A\_{j,k,l},\tag{11}$$

*where*

$$A\_{j,k,l} = \begin{cases} \left(\phi\_{L}\left(-i\sigma\_{j}\sqrt{T}\right)\phi\_{L}\left(-i\sigma\_{k}\sqrt{T}\right)\phi\_{L}\left(-i\sigma\_{l}\sqrt{T}\right)\right)^{1-\rho}, & \text{if } j \neq k, k \neq l \text{ and } j \neq l;\\ \left(\phi\_{L}\left(-i(\sigma\_{j}+\sigma\_{k})\sqrt{T}\right)\phi\_{L}\left(-i\sigma\_{l}\sqrt{T}\right)\right)^{1-\rho}, & \text{if } j = k, k \neq l;\\ \left(\phi\_{L}\left(-i(\sigma\_{k}+\sigma\_{l})\sqrt{T}\right)\phi\_{L}\left(-i\sigma\_{j}\sqrt{T}\right)\right)^{1-\rho}, & \text{if } j \neq k, k = l;\\ \left(\phi\_{L}\left(-i(\sigma\_{j}+\sigma\_{l})\sqrt{T}\right)\phi\_{L}\left(-i\sigma\_{k}\sqrt{T}\right)\right)^{1-\rho}, & \text{if } j = l, k \neq l;\\ \phi\_{L}\left(-i\left(\sigma\_{j}+\sigma\_{k}+\sigma\_{l}\right)\sqrt{T}\right)^{1-\rho}, & \text{if } j = k = l. \end{cases}$$
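Lemma 1 translates directly into code. The sketch below evaluates (9)–(11) for a standard normal mother distribution, $\phi\_L(u) = \exp(-u^2/2)$; the function name and all parameter values are our own illustrative choices. For *n* = 1 the formulas collapse to the single-stock lognormal moments, which makes a convenient check:

```python
import math, cmath, itertools

def phi_gauss(u):
    """Characteristic function of a standard normal mother distribution."""
    return cmath.exp(-0.5 * u * u)

def basket_moments(S0s, sigmas, weights, rho, r, T, qs=None):
    """First three basket moments per Lemma 1, Gaussian mother distribution."""
    n = len(S0s)
    qs = qs or [0.0] * n
    sqT = math.sqrt(T)
    phi = lambda s: phi_gauss(-1j * s * sqT)        # phi_L(-i s sqrt(T))
    ES = [w * s0 * math.exp((r - q) * T) for w, s0, q in zip(weights, S0s, qs)]

    m1 = sum(ES)
    m2 = 0.0
    for j, k in itertools.product(range(n), repeat=2):
        rjk = 1.0 if j == k else rho
        m2 += ES[j] * ES[k] * ((phi(sigmas[j] + sigmas[k])
                                / (phi(sigmas[j]) * phi(sigmas[k])))**rjk).real
    m3 = 0.0
    for j, k, l in itertools.product(range(n), repeat=3):
        num = phi(sigmas[j] + sigmas[k] + sigmas[l])**rho
        den = phi(sigmas[j]) * phi(sigmas[k]) * phi(sigmas[l])
        if j != k and k != l and j != l:
            A = (phi(sigmas[j]) * phi(sigmas[k]) * phi(sigmas[l]))**(1 - rho)
        elif j == k and k != l:
            A = (phi(sigmas[j] + sigmas[k]) * phi(sigmas[l]))**(1 - rho)
        elif j != k and k == l:
            A = (phi(sigmas[k] + sigmas[l]) * phi(sigmas[j]))**(1 - rho)
        elif j == l and k != l:
            A = (phi(sigmas[j] + sigmas[l]) * phi(sigmas[k]))**(1 - rho)
        else:  # j == k == l
            A = phi(sigmas[j] + sigmas[k] + sigmas[l])**(1 - rho)
        m3 += ES[j] * ES[k] * ES[l] * (num / den * A).real
    return m1, m2, m3

# Single-stock check with hypothetical parameters
m1, m2, m3 = basket_moments([100.0], [0.2], [1.0], rho=0.5, r=0.02, T=1.0)
```

Note that for *n* = 1 the result is independent of ρ, as it must be, since the cases in $A\_{j,k,l}$ cancel the ρ-dependence when all indices coincide.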

In Sect. 2.2 we derived the first three moments for each stock *j*, *j* = 1, 2,..., *n*. Taking into account the similarity between the price *Sj*(*T*) defined in (6) and the approximate r.v. $\bar{S}(T)$ defined in (8), we can determine the first three moments of $\bar{S}(T)$:

$$\begin{split} \mathbb{E}\left[\bar{S}(T)\right] &= S(0)\mathbf{e}^{\bar{\mu}T} =: \xi, \\ \mathbb{E}\left[\bar{S}(T)^{2}\right] &= \mathbb{E}\left[\bar{S}(T)\right]^{2} \frac{\phi\_{L}\left(-\mathrm{i}2\bar{\sigma}\sqrt{T}\right)}{\phi\_{L}\left(-\mathrm{i}\bar{\sigma}\sqrt{T}\right)^{2}} =: \xi^{2}\alpha, \\ \mathbb{E}\left[\bar{S}(T)^{3}\right] &= \mathbb{E}\left[\bar{S}(T)\right]^{3} \frac{\phi\_{L}\left(-\mathrm{i}3\bar{\sigma}\sqrt{T}\right)}{\phi\_{L}\left(-\mathrm{i}\bar{\sigma}\sqrt{T}\right)^{3}} =: \xi^{3}\beta. \end{split}$$

These expressions can now be used to determine the first three moments of the approximate r.v. $\tilde{S}(T)$:

$$\begin{split} \mathbb{E}\left[\tilde{S}(T)\right] &= \mathbb{E}\left[\bar{S}(T)\right] + \lambda, \\ \mathbb{E}\left[\tilde{S}(T)^{2}\right] &= \mathbb{E}\left[\bar{S}(T)^{2}\right] + \lambda^{2} + 2\lambda \mathbb{E}\left[\bar{S}(T)\right], \\ \mathbb{E}\left[\tilde{S}(T)^{3}\right] &= \mathbb{E}\left[\bar{S}(T)^{3}\right] + \lambda^{3} + 3\lambda^{2} \mathbb{E}\left[\bar{S}(T)\right] + 3\lambda \mathbb{E}\left[\bar{S}(T)^{2}\right]. \end{split}$$

Determining the parameters $\bar{\mu}$, $\bar{\sigma}$, and the shifting parameter λ by matching the first three moments results in the following set of equations

$$\begin{aligned} m\_1 &= \xi + \lambda, \\ m\_2 &= \xi^2 \alpha + \lambda^2 + 2\lambda \xi, \\ m\_3 &= \xi^3 \beta + \lambda^3 + 3\lambda^2 \xi + 3\lambda \xi^2 \alpha. \end{aligned}$$

These equations can be recast in the following set of equations

$$\begin{aligned} \lambda &= m\_1 - \xi, \\ \xi^2 &= \frac{m\_2 - m\_1^2}{\alpha - 1}, \\ 0 &= \left(\frac{m\_2 - m\_1^2}{\alpha - 1}\right)^{3/2} (\beta + 2 - 3\alpha) + 3m\_1m\_2 - 2m\_1^3 - m\_3. \end{aligned}$$

Remember that α and β are defined by

$$\alpha = \frac{\phi\_L \left(-\mathrm{i}2\bar{\sigma}\sqrt{T}\right)}{\phi\_L \left(-\mathrm{i}\bar{\sigma}\sqrt{T}\right)^2} \quad \text{and} \quad \beta = \frac{\phi\_L \left(-\mathrm{i}3\bar{\sigma}\sqrt{T}\right)}{\phi\_L \left(-\mathrm{i}\bar{\sigma}\sqrt{T}\right)^3}.$$

Solving the third equation results in the parameter $\bar{\sigma}$. Note that this equation does not always have a solution. This issue was also discussed in Brigo et al. [7] for the Gaussian copula case; however, in our numerical studies we did not encounter any numerical problems. Once $\bar{\sigma}$ is known, we can determine ξ and λ from the first two equations. Next, the drift $\bar{\mu}$ can be determined from

$$
\bar{\mu} = \frac{1}{T} \log \frac{\xi}{S(0)}.
$$
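The moment-matching recipe above can be sketched in code. For a standard normal mother distribution, $\alpha = \mathrm{e}^{\bar{\sigma}^2 T}$ and $\beta = \mathrm{e}^{3\bar{\sigma}^2 T}$, and the third equation can be solved by plain bisection. The bracket below is an assumption that happens to contain the root for the test case, a shifted lognormal whose parameters the solver should recover exactly:

```python
import math

T = 1.0  # hypothetical maturity

def alpha(s):
    """alpha(sigma-bar) for a standard normal mother distribution."""
    return math.exp(s * s * T)

def beta(s):
    """beta(sigma-bar) for a standard normal mother distribution."""
    return math.exp(3.0 * s * s * T)

def solve_mm(m1, m2, m3, S0):
    """Recover (sigma-bar, lambda, mu-bar) from the first three basket moments."""
    def f(s):  # the third equation of the recast system
        a, b = alpha(s), beta(s)
        return (((m2 - m1**2) / (a - 1.0))**1.5 * (b + 2.0 - 3.0 * a)
                + 3.0 * m1 * m2 - 2.0 * m1**3 - m3)
    lo, hi = 0.01, 2.0        # assumed bracket for the root
    for _ in range(200):      # bisection
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    s = 0.5 * (lo + hi)
    xi = math.sqrt((m2 - m1**2) / (alpha(s) - 1.0))
    lam = m1 - xi
    mu = math.log(xi / S0) / T
    return s, lam, mu

# Sanity check: a shifted lognormal target is recovered exactly.
S0, sig0, mu0, lam0 = 100.0, 0.2, 0.05, 5.0
xi0 = S0 * math.exp(mu0 * T)
m1 = xi0 + lam0
m2 = xi0**2 * alpha(sig0) + lam0**2 + 2.0 * lam0 * xi0
m3 = (xi0**3 * beta(sig0) + lam0**3 + 3.0 * lam0**2 * xi0
      + 3.0 * lam0 * xi0**2 * alpha(sig0))
s, lam, mu = solve_mm(m1, m2, m3, S0)
```

For other mother distributions only `alpha` and `beta` change, via the corresponding φ*L*.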

# *3.2 Approximate Basket Option Pricing*

The price of a basket option with strike *K* and maturity *T* is denoted by *C*[*K*, *T*]. This unknown price is approximated in this section by *CMM* [*K*, *T*], which is defined as

$$C^{MM}[K,T] = \mathbf{e}^{-rT} \mathbb{E}\left[\left(\widetilde{\mathbf{S}}(T) - K\right)\_+\right].$$

Using expression (7) for $\tilde{S}(T)$, the price $C^{MM}[K, T]$ can be expressed as

$$C^{MM}[K,T] = \mathbf{e}^{-rT} \mathbb{E}\left[\left(\tilde{\mathbf{S}}(T) - (K-\lambda)\right)\_+\right].$$

Note that the distribution of $\bar{S}(T)$ also depends on the choice of λ. In order to determine the price $C^{MM}[K, T]$, we should be able to price an option written on $\bar{S}(T)$ with a shifted strike *K* − λ. Determining the approximation $C^{MM}[K, T]$ using the Carr–Madan formula requires knowledge of the characteristic function $\phi\_{\log \bar{S}(T)}$ of $\log \bar{S}(T)$:

$$\phi\_{\log \bar{S}(T)}(\mu) = \mathbb{E}\left[\mathbf{e}^{\mathrm{i}\mu \log \bar{S}(T)}\right].$$

Using expression (8) we find that

$$\phi\_{\log \bar{S}(T)}(\mu) = \mathbb{E}\left[\exp\left\{\mathrm{i}\mu \left(\log S(0) + (\bar{\mu} - \bar{\omega})T + \bar{\sigma}\sqrt{T}A\right)\right\}\right].$$

The characteristic function of *A* is φ*L*, from which we find that

$$\phi_{\log \bar{S}(T)}(u) = \exp\left\{\mathrm{i}u\left(\log S(0) + (\bar{\mu} - \bar{\omega})T\right)\right\}\phi_L\left(u\bar{\sigma}\sqrt{T}\right).$$

Note that nowhere in this section did we use the assumption that the basket weights $w_j$ are strictly positive. Therefore, the three-moments-matching approach proposed in this section can also be used to price, e.g., spread options. However, alternative methods exist for pricing spread options; see e.g. Carmona and Durrleman [11], Hurd and Zhou [24], and Caldana and Fusai [9].
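The characteristic function above can be coded directly. The sketch below again uses, as an assumption of ours, the standard Laplace distribution for $L$, whose characteristic function $\phi_L(u) = 1/(1 + u^2/2)$ extends to the complex arguments needed later for the Carr–Madan formula; all parameter values are illustrative.

```python
import cmath

def phi_L(u):
    """Characteristic function of the standard Laplace distribution (complex u allowed)."""
    return 1.0 / (1.0 + 0.5 * u * u)

def phi_log_S_bar(u, S0, mu_bar, omega_bar, sigma_bar, T):
    """Characteristic function of log S_bar(T), where
    log S_bar(T) = log S(0) + (mu_bar - omega_bar) T + sigma_bar sqrt(T) A."""
    drift = cmath.log(S0) + (mu_bar - omega_bar) * T
    return cmath.exp(1j * u * drift) * phi_L(u * sigma_bar * T ** 0.5)
```

By construction, $\phi_{\log \bar{S}(T)}(0) = 1$ and $|\phi_{\log \bar{S}(T)}(u)| \le 1$ for real $u$, which gives a quick consistency check on any implementation.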

# *3.3 The FFT Method and Basket Option Pricing*

Consider a random variable $X$. In this section we show that if the characteristic function $\phi_{\log X}$ of $\log X$ is known, one can approximate the discounted stop-loss premium

$$\mathrm{e}^{-rT}\,\mathbb{E}\left[\left(X-K\right)_+\right],$$

for any *K* > 0.

Let $\alpha > 0$ and assume that $\mathbb{E}\left[X^{\alpha+1}\right]$ exists and is finite. It was proven in Carr and Madan [13] that the price $\mathrm{e}^{-rT}\,\mathbb{E}\left[(X-K)_+\right]$ can be expressed as follows:

$$\mathrm{e}^{-rT}\,\mathbb{E}\left[\left(X-K\right)_+\right] = \frac{\mathrm{e}^{-\alpha\log(K)}}{\pi}\int_0^{+\infty}\exp\left\{-\mathrm{i}v\log(K)\right\}g(v)\,dv,\qquad(12)$$

where

$$g(v) = \frac{\mathrm{e}^{-rT}\,\phi_{\log X}\left(v - (\alpha + 1)\mathrm{i}\right)}{\alpha^2 + \alpha - v^2 + \mathrm{i}(2\alpha + 1)v}. \tag{13}$$

The approximation $C^{MM}[K, T]$ was introduced in Sect. 3 and the random variable $X$ now denotes the moment-matching approximation $\widetilde{S}(T) = \bar{S}(T) + \lambda$. The approximation $C^{MM}[K, T]$ can then be determined as the price of an option written on $\bar{S}(T)$ with shifted strike price $K - \lambda$.
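A minimal numerical sketch of formulas (12)–(13): we price a vanilla call in the Black–Scholes model, where $\phi_{\log X}$ is known in closed form, and check the result against the Black–Scholes formula. The damping parameter $\alpha = 1.5$, the truncation point, and a plain trapezoidal rule in place of the FFT are our own simplifying choices.

```python
import cmath
import math

def carr_madan_price(phi_log_X, K, r, T, alpha=1.5, v_max=100.0, n=20000):
    """Discounted stop-loss premium e^{-rT} E[(X - K)_+] via formulas (12)-(13)."""
    logK = math.log(K)
    h = v_max / n
    total = 0.0
    for i in range(n + 1):
        v = i * h
        g = math.exp(-r * T) * phi_log_X(v - (alpha + 1) * 1j) \
            / (alpha ** 2 + alpha - v ** 2 + 1j * (2 * alpha + 1) * v)
        w = 0.5 if i in (0, n) else 1.0  # trapezoidal weights
        total += w * (cmath.exp(-1j * v * logK) * g).real
    return math.exp(-alpha * logK) / math.pi * total * h

def bs_phi(u, S0, r, sigma, T):
    """Characteristic function of log S(T) in the Black-Scholes model."""
    m = math.log(S0) + (r - 0.5 * sigma ** 2) * T
    return cmath.exp(1j * u * m - 0.5 * sigma ** 2 * T * u * u)

def bs_call(S0, K, r, sigma, T):
    """Closed-form Black-Scholes call price, used as a benchmark."""
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return S0 * N(d1) - K * math.exp(-r * T) * N(d2)
```

With $\phi_{\log X}$ replaced by the characteristic function of $\log \bar{S}(T)$ from the previous subsection and the strike shifted to $K - \lambda$, the same routine yields $C^{MM}[K, T]$.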


**Table 1** Overview of infinitely divisible distributions

# **4 Examples and Numerical Illustrations**

The Gaussian copula model with equicorrelation is a member of our class of one-factor Lévy models. In this section we discuss how to build the Gaussian, Variance Gamma, Normal Inverse Gaussian, and Meixner models. The reader is invited to construct one-factor Lévy models based on other infinitely divisible distributions, e.g. the CGMY or generalized hyperbolic distributions.

Table 1 summarizes the Gaussian, Variance Gamma, Normal Inverse Gaussian, and Meixner distributions, which are all infinitely divisible. The last row shows how to construct a standardized version of each of these distributions. We assume that $L$ is distributed according to one of these standardized distributions; hence, $L$ has zero mean and unit variance. Furthermore, the characteristic function $\phi_L$ of $L$ is given in closed form. We can then define the Lévy processes $X$ and $X_j$, $j = 1, 2, \ldots, n$, based on the mother distribution $L$. The random variables $A_j$, $j = 1, 2, \ldots, n$, are modeled using expression (1).
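To make the construction of expression (1) concrete, the sketch below simulates $A_j = X(\rho) + X_j(1-\rho)$ for the simplest member of the class, the Gaussian one, where the common component is $X(\rho) \sim N(0, \rho)$ and the idiosyncratic components are $X_j(1-\rho) \sim N(0, 1-\rho)$, so that each $A_j$ is standard normal and $\mathrm{Corr}(A_j, A_k) = \rho$ for $j \neq k$. This is only a simulation illustration of the dependence structure, not the pricing method itself; the sample sizes and seed are ours.

```python
import math
import random

def sample_A(n_assets, rho, rng):
    """One draw of (A_1, ..., A_n) in the Gaussian one-factor model:
    A_j = X(rho) + X_j(1 - rho), both components normal."""
    common = rng.gauss(0.0, math.sqrt(rho)) if rho > 0.0 else 0.0
    return [common + rng.gauss(0.0, math.sqrt(1.0 - rho)) for _ in range(n_assets)]

def empirical_corr(rho, n_samples=100_000, seed=1):
    """Monte Carlo estimate of Corr(A_1, A_2)."""
    rng = random.Random(seed)
    s1 = s2 = s11 = s22 = s12 = 0.0
    for _ in range(n_samples):
        a1, a2 = sample_A(2, rho, rng)
        s1 += a1; s2 += a2
        s11 += a1 * a1; s22 += a2 * a2; s12 += a1 * a2
    n = n_samples
    c12 = s12 / n - (s1 / n) * (s2 / n)
    v1 = s11 / n - (s1 / n) ** 2
    v2 = s22 / n - (s2 / n) ** 2
    return c12 / math.sqrt(v1 * v2)
```

For a non-Gaussian mother distribution the same decomposition applies, with $X(\rho)$ and $X_j(1-\rho)$ replaced by the corresponding Lévy process sampled at times $\rho$ and $1-\rho$.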


**Table 2** Basket option prices in the one-factor VG model with *S*1(0) = 40, *S*2(0) = 50, *S*3(0) = 60, *S*4(0) = 70, and ρ = 0

# *4.1 Variance Gamma*

Although pricing basket options under a normality assumption is tractable from a computational point of view, it introduces a high degree of model risk; see e.g. Leoni and Schoutens [28]. The Variance Gamma distribution has already been proposed as a more flexible alternative to the Brownian setting; see e.g. Madan and Seneta [34] and Madan et al. [35].

We consider two numerical examples where $L$ has a Variance Gamma distribution with parameters $\sigma = 0.5695$, $\nu = 0.75$, $\theta = -0.9492$, $\mu = 0.9492$. Table 2 contains the numerical values for the first illustration, where a basket option on four stocks paying $\left(\frac{1}{4}\sum_{j=1}^{4} S_j(T) - K\right)_+$ at time $T$ is considered. We use the following parameter values: $r = 6\,\%$, $T = 0.5$, $\rho = 0$, and $S_1(0) = 40$, $S_2(0) = 50$, $S_3(0) = 60$, $S_4(0) = 70$. These parameter values are also used in Sect. 5 of Korn and Zeytun [27]. We denote by $C^{mc}[K, T]$ the corresponding Monte Carlo estimate for the price $C[K, T]$; here, $10^7$ simulations are used. The approximation of the basket option price $C[K, T]$ using the moment-matching approach outlined in Sect. 3 is denoted by $C^{MM}[K, T]$. A comparison between the empirical density and the approximate density is provided in Fig. 1.

In the second example, we consider the basket $S(T) = w_1 X_1(T) + w_2 X_2(T)$, written on two non-dividend paying stocks. We use the parameter values of Sect. 7 of Deelstra et al. [18], hence $r = 5\,\%$, $X_1(0) = X_2(0) = 100$, and $w_1 = w_2 = 0.5$. Table 3 gives numerical values for these basket options. Note that strike prices are expressed in terms of forward moneyness: a basket strike price $K$ has forward moneyness equal to $K/\mathbb{E}[S(T)]$. We can conclude that the three-moments-matching approximation gives acceptable results. For far out-of-the-money call options, however, the approximation is not always able to closely approximate the real basket option price.

We also investigate the sensitivity with respect to the Variance Gamma parameters $\sigma$, $\nu$, and $\theta$ and to the correlation parameter $\rho$. We consider a basket option consisting of 3 stocks, i.e. $n = 3$. From Tables 2 and 3, we observe that the error is largest when the marginal volatilities differ and the option under consideration is an out-of-the-money basket call. Therefore, we put $\sigma_1 = 0.2$, $\sigma_2 = 0.4$, $\sigma_3 = 0.6$ and determine the prices $C^{mc}[K, T]$ and $C^{MM}[K, T]$ for $K = 105.13$. The other parameter values are: $r = 0.05$, $\rho = 0.5$, $w_1 = w_2 = w_3 = 1/3$, and $T = 1$. The first panel of Fig. 2 shows the relative error for varying $\sigma$. The second panel shows the relative error as a function of $\nu$, the third panel the sensitivity with respect to $\theta$, and the fourth panel the relative error as a function of $\rho$.

The numerical results show that the approximations do not always manage to closely approximate the true basket option price. Especially when some of the volatilities deviate substantially from the others, the accuracy of the approximation deteriorates. The poor performance of the moment-matching approximation in the Gaussian copula model was already reported in Brigo et al. [7]. However, in order to calibrate the Lévy copula model to available option data, a basket option pricing formula that can be evaluated quickly is of crucial importance. Table 4 shows the CPU times<sup>3</sup> for the one-factor VG model for different basket dimensions. The calculation time of approximate basket option prices when 100 stocks are involved is less than one second. Therefore, the moment-matching approximation is a good candidate for calibrating the one-factor Lévy model.

# *4.2 Pricing Basket Options*

In this subsection we explain how to determine the price of a basket option in a realistic situation where option prices of the components of the basket are available and used to calibrate the marginal parameters. In our example, the basket under consideration consists of 2 major stock market indices (*n* = 2), the S&P500 and the Nasdaq:

$$\text{Basket} = w_1\,\text{S\&P 500} + w_2\,\text{Nasdaq}.$$

The pricing date is February 19, 2009 and we determine prices for the Normal, VG, NIG, and Meixner case. The details of the basket are listed in Table 5. The weights

<sup>3</sup>The numerical illustrations are performed on an Intel Core i7, 2.70 GHz.

**Fig. 1** Probability density function of the real basket (*solid line*) and the approximate basket (*dashed line*). The basket option consists of 4 stocks and $r = 0.06$, $\rho = 0$, $T = 1/2$, $w_1 = w_2 = w_3 = w_4 = \frac{1}{4}$. All volatility parameters are equal to $\sigma$

**Fig. 2** Relative error in the one-factor VG model for the three-moments-matching approximation. The basket option consists of 3 stocks and $r = 0.05$, $\rho = 0.5$, $T = 1$, $\sigma_1 = 0.2$, $\sigma_2 = 0.4$, $\sigma_3 = 0.6$, $w_1 = w_2 = w_3 = \frac{1}{3}$. The strike price is $K = 105.13$. In the benchmark model, the VG parameters are $\sigma = 0.57$, $\nu = 0.75$, $\theta = -0.95$, $\mu = 0.95$

$w_1$ and $w_2$ are chosen such that the initial price $S(0)$ of the basket is equal to 100. The maturity of the basket option is equal to 30 days.

The S&P 500 and Nasdaq option curves are denoted by $C_1$ and $C_2$, respectively. These option curves are only partially known: the traded strikes for curve $C_j$ are denoted by $K_{i,j}$, $i = 1, 2, \ldots, N_j$, where $N_j > 1$. If the volatilities $\sigma_1$ and $\sigma_2$


**Table 3** Basket option prices in the one-factor VG model with $r = 0.05$, $w_1 = w_2 = 0.5$, $X_1(0) = X_2(0) = 100$ and $\sigma_1 = \sigma_2$

and the characteristic function $\phi_L$ of the mother distribution $L$ are known, we can determine the model price of an option on asset $j$ with strike $K$ and maturity $T$. This price is denoted by $C_j^{model}[K, T; \Theta, \sigma_j]$, where $\Theta$ denotes the vector containing the model parameters of $L$. Given the systematic component, the stocks are independent. Therefore, we can use the observed option curves $C_1$ and $C_2$ to calibrate the model parameters as follows:

#### **Algorithm 1** (*Determining the parameters* $\Theta$ *and* $\sigma_j$ *of the one-factor Lévy model*)

Step 1: Choose a parameter vector $\Theta$.

Step 2: For each stock $j = 1, 2, \ldots, n$, determine the volatility $\sigma_j$ as follows:

$$
\sigma_j = \arg\min_{\sigma} \frac{1}{N_j} \sum_{i=1}^{N_j} \frac{\left| C_j^{model}[K_{i,j}, T; \Theta, \sigma] - C_j[K_{i,j}] \right|}{C_j[K_{i,j}]},
$$


**Table 4** The CPU time (in seconds) for the one-factor VG model for increasing basket dimension *n*

The following parameters are used: $r = 0.05$, $T = 1$, $\rho = 0.5$, $w_j = \frac{1}{n}$, $\sigma_j = 0.4$, $q_j = 0$, $S_j(0) = 100$, for $j = 1, 2, \ldots, n$. The basket strike is $K = 105.13$

**Table 5** Input data for the basket option


Step 3: Determine the total error:

$$\text{error} = \sum_{j=1}^{n} \frac{1}{N_j} \sum_{i=1}^{N_j} \frac{\left| C_j^{model}[K_{i,j}, T; \Theta, \sigma_j] - C_j[K_{i,j}] \right|}{C_j[K_{i,j}]}.$$

Repeat these three steps until the parameter vector $\Theta$ is found for which the total error is minimal. The corresponding volatilities $\sigma_1, \sigma_2, \ldots, \sigma_n$ are called the implied Lévy volatilities.

Only a limited number of option quotes is required to calibrate the one-factor Lévy model. Indeed, the parameter vector $\Theta$ can be determined using all available option quotes; additionally, one volatility parameter has to be determined for each stock. However, other methodologies for determining $\Theta$ exist. For example, one can fix the parameter vector $\Theta$ upfront, as is shown in Sect. 5.2. In such a situation, only one implied Lévy volatility has to be calibrated for each stock.
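Step 2 of Algorithm 1 is a one-dimensional fit per stock. The sketch below illustrates it with a hypothetical pricer: the Black–Scholes call formula stands in for $C_j^{model}$, synthetic "market" quotes are generated at $\sigma = 0.3$, and $\sigma_j$ is recovered by a simple grid search over the mean relative error. All function names and parameter values are ours.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def model_call(K, T, sigma, S0=100.0, r=0.05):
    """Stand-in for C_j^model[K, T; Theta, sigma] (Black-Scholes call here)."""
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S0 * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def calibrate_sigma(strikes, quotes, T):
    """Grid search for the sigma minimising the mean relative pricing error (Step 2)."""
    def error(sig):
        return sum(abs(model_call(K, T, sig) - q) / q
                   for K, q in zip(strikes, quotes)) / len(strikes)
    grid = [0.05 + 0.001 * i for i in range(950)]  # sigma in [0.05, 1.0)
    return min(grid, key=error)
```

In the full algorithm this inner fit is repeated for every stock and wrapped in an outer search over $\Theta$ (Steps 1 and 3), which is why fixing $\Theta$ upfront, as in Sect. 5.2, reduces the computational burden so drastically.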

The calibrated parameters together with the calibration error are listed in Table 6. Note that the relative error in the VG, Meixner, and NIG case is significantly smaller


**Table 6** One-factor Lévy models: Calibrated model parameters

**Table 7** Basket option prices for the basket given in Table 5


The time to maturity is 30 days

than in the normal case. Using the calibrated parameters for the mother distribution $L$ together with the volatility parameters $\sigma_1$ and $\sigma_2$, we can determine basket option prices in the different model settings. Note that here and in the remainder of the paper, we always use the three-moments-matching approximation for determining basket option prices. We put $T = 30$ days and consider the cases where the correlation parameter $\rho$ is given by 0.1, 0.5, and 0.8. The corresponding basket option prices are listed in Table 7. One can observe from the table that each model generates a different basket option price, i.e. there is model risk. However, the difference between the

**Fig. 3** Implied market and model volatilities for February 19, 2009 for the S&P 500 (*left*) and the Nasdaq (*right*), with time to maturity 30 days

Gaussian and the non-Gaussian models is much more pronounced than the differences among the non-Gaussian models. We also find that assuming normally distributed log returns leads to an underestimation of the basket option prices. Indeed, the basket option prices $C^{VG}[K, T]$, $C^{Meixner}[K, T]$, and $C^{NIG}[K, T]$ are larger than $C^{BLS}[K, T]$. In the next section, however, we encounter situations where the Gaussian basket option price is larger than the corresponding VG price for out-of-the-money options. The reason for this behavior is that the marginal log returns are negatively skewed in the non-Gaussian situations, whereas they are symmetric in the Gaussian case. This skewness results in a lower probability of ending in the money for options with a sufficiently large strike (Fig. 3).

# **5 Implied Lévy Correlation**

In Sect. 4.2 we showed how the basket option formulas can be used to obtain basket option prices in the Lévy copula model. The parameter vector $\Theta$ describing the mother distribution $L$ and the implied Lévy volatility parameters $\sigma_j$ can be calibrated using the observed vanilla option curves $C_j[K, T]$ of the stocks composing the basket $S(T)$; see Algorithm 1. In this section we show how an implied Lévy correlation estimate $\rho$ can be obtained if, in addition to the vanilla options, market prices for a basket option are also available.

We assume that $S(T)$ represents the time-$T$ price of a stock market index, such as the Dow Jones, S&P 500, or EUROSTOXX 50. Furthermore, options on $S(T)$ are traded and their prices are observable for a finite number of strikes; pricing these index options is therefore not an issue. We denote the market price of an index option with maturity $T$ and strike $K$ by $C[K, T]$. Assume now that the stocks composing the index can be described by the one-factor Lévy model (6). If the parameter vector $\Theta$ and the marginal volatility vector $\underline{\sigma} = (\sigma_1, \sigma_2, \ldots, \sigma_n)$ are determined using Algorithm 1, the model price $C^{model}[K, T; \underline{\sigma}, \Theta, \rho]$ for the basket option only depends on the choice of the correlation $\rho$. An *implied correlation* estimate for $\rho$ arises when we match the model price with the observed index option price.

**Definition 1** (*Implied Lévy correlation*) Consider the one-factor Lévy model defined in (6). The *implied Lévy correlation* of the index $S(T)$ at moneyness $\pi = K/S(0)$, denoted by $\rho[\pi]$, is defined by the following equation:

$$C^{model}\left[K, T; \underline{\sigma}, \Theta, \rho\left[\pi\right]\right] = C[K, T], \tag{14}$$

where $\underline{\sigma}$ contains the marginal implied volatilities and $\Theta$ is the parameter vector of $L$.

Determining an implied correlation estimate $\rho[K/S(0)]$ requires an inversion of the pricing formula $\rho \mapsto C^{model}[K, T; \underline{\sigma}, \Theta, \rho]$. However, the basket option price is not given in closed form, and determining it by Monte Carlo simulation would result in a slow procedure. If we instead determine $C^{model}[K, T; \underline{\sigma}, \Theta, \rho]$ using the three-moments-matching approach, implied correlations can be determined in a fast and efficient way. The idea of determining implied correlation estimates based on an approximate basket option pricing formula was already proposed in Chicago Board Options Exchange [15], Cont and Deguest [16], Linders and Schoutens [30], and Linders and Stassen [31].
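The inversion $\rho \mapsto C^{model}$ can be sketched with a toy monotone pricer. Below, a seeded two-asset Gaussian basket Monte Carlo pricer stands in for the fast moment-matching price; because the same random draws are reused for every $\rho$, the price is a smooth, increasing function of $\rho$, and a plain bisection recovers the correlation that reproduces a given quote. Everything here (the stand-in pricer, the parameters, the seed) is our illustrative choice, not the paper's calibration.

```python
import math
import random

def basket_price(rho, n_paths=20_000, seed=7,
                 S0=100.0, K=100.0, r=0.05, sigma=0.3, T=1.0):
    """Equally weighted two-asset basket call, common random numbers across rho."""
    rng = random.Random(seed)
    disc = math.exp(-r * T)
    drift = (r - 0.5 * sigma ** 2) * T
    vol = sigma * math.sqrt(T)
    total = 0.0
    for _ in range(n_paths):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        a1 = z1
        a2 = rho * z1 + math.sqrt(1.0 - rho * rho) * z2
        s1 = S0 * math.exp(drift + vol * a1)
        s2 = S0 * math.exp(drift + vol * a2)
        total += max(0.5 * (s1 + s2) - K, 0.0)
    return disc * total / n_paths

def implied_correlation(quote, lo=0.0, hi=0.999, n_iter=20):
    """Bisection on rho: the ATM basket call price is increasing in rho."""
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if basket_price(mid) < quote:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

In the paper's setting the Monte Carlo pricer is replaced by the much faster three-moments-matching price $C^{MM}$, so the same bisection can be run strike by strike to trace out the implied correlation curve.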

Note that if we take $L$ to be the standard normal distribution, $\rho[\pi]$ is an implied Gaussian correlation; see e.g. Chicago Board Options Exchange [15] and Skintzi and Refenes [45]. Equation (14) can be considered a generalization of the implied Gaussian correlation: instead of determining the single correlation parameter in a multivariate model with normal log returns and a Gaussian copula, we can now extend the model to the situation where the log returns follow a Lévy distribution. A similar idea was proposed in Garcia et al. [20] and further studied in Masol and Schoutens [37]; in these papers, a Lévy base correlation is defined using CDS and CDO prices.

The proposed methodology for determining implied correlation estimates can also be applied to other multi-asset derivatives. For example, implied correlation estimates can be extracted from traded spread options [46], best-of basket options [19], and quanto options [4]. Implied correlation estimates based on various multi-asset products are discussed in Austing [2].

# *5.1 Variance Gamma*

In order to illustrate the proposed methodology for determining implied Lévy correlation estimates, we use the Dow Jones Industrial Average (DJ). The DJ is composed of 30 stocks, and for each underlying we have a finite number of option prices to which we can calibrate the parameter vector $\Theta$ and the Lévy volatility parameters $\sigma_j$. Using the available vanilla option data for June 20, 2008, we work out the Gaussian and the Variance Gamma case.<sup>4</sup> Note that options on components of the Dow Jones are of American type. In the sequel, we assume that the American option price is a good proxy for the corresponding European option price. This assumption is justified because we use short-term and out-of-the-money options.

The single volatility parameter $\sigma_j$ is determined for stock $j$ by minimizing the relative error between the model and the market vanilla option prices; see Algorithm 1. Assuming a normal distribution for $L$, this volatility parameter is denoted by $\sigma_j^{BLS}$, whereas the notation $\sigma_j^{VG}$, $j = 1, 2, \ldots, n$, is used for the VG model. For June 20, 2008, the parameter vector $\Theta$ for the VG copula model is given in Table 9 and the implied volatilities are listed in Table 8. Figure 4 shows the model (Gaussian and VG) and market prices for General Electric and IBM, both members of the Dow Jones, based on the implied volatility parameters listed in Table 8. We observe that the Variance Gamma copula model is more suitable for capturing the dynamics of the components of the Dow Jones than the Gaussian copula model.

Given the volatility parameters for the Variance Gamma case and the normal case, listed in Table 8, the implied correlation defined by Eq. (14) can be determined from the available Dow Jones index options on June 20, 2008. For a given index strike $K$, the moneyness $\pi$ is defined as $\pi = K/S(0)$. The implied Gaussian correlation (also called Black and Scholes correlation) is denoted by $\rho^{BLS}[\pi]$ and the corresponding implied Lévy correlation, based on a VG distribution, is denoted by $\rho^{VG}[\pi]$. In order to match the vanilla option curves more closely, we take into account the implied volatility smile and use, for each stock $j$, a volatility parameter at moneyness $\pi$, which we denote by $\sigma_j[\pi]$. For a detailed, step-by-step plan for the calculation of these volatility parameters, we refer to Linders and Schoutens [30].

Figure 5 shows that both the implied Black and Scholes and the implied Lévy correlation depend on the moneyness $\pi$. However, for low strikes we observe that $\rho^{VG}[\pi] < \rho^{BLS}[\pi]$, whereas the opposite inequality holds for large strikes, making the implied Lévy correlation curve less steep than its Black and Scholes counterpart. In Linders and Schoutens [30], the authors discuss the shortcomings of the implied Black and Scholes correlation and show that implied Black and Scholes correlations

<sup>4</sup>All data used for calibration are extracted from an internal database of the KU Leuven.


**Table 8** Implied Variance Gamma volatilities $\sigma_j^{VG}$ and implied Black and Scholes volatilities $\sigma_j^{BLS}$ for June 20, 2008

can become larger than one for low strike prices. Our more general approach using the implied Lévy correlation solves this problem, at least to some extent. Indeed, the region where the implied correlation stays below 1 is much larger for the flatter implied Lévy correlation curve than for its Black and Scholes counterpart. We also observe that near the at-the-money strikes, VG and Black and Scholes correlation estimates are comparable, which may be a sign that in this region the use of implied Black and Scholes correlation (as defined in Linders and Schoutens [30]) is justified.


**Table 9** Calibrated VG parameters for different trading days

**Fig. 4** Option prices and implied volatilities (model and market) for Exxon Mobile and IBM on June 20, 2008 based on the parameters listed in Table 8. The time to maturity is 30 days

Figure 7 shows implied correlation curves for March, April, July, and August 2008. In all these situations, the time to maturity is close to 30 days. The calibrated parameters for each trading day are listed in Table 9.

We determine the implied correlation $\rho^{VG}[\pi]$ such that the model and market quotes for an index option with moneyness $\pi = K/S(0)$ coincide. However, the model price is determined using the three-moments-matching approximation and may deviate from the real model price. Indeed, we determine $\rho^{VG}[\pi]$ such that $C^{MM}\left[K, T; \underline{\sigma}, \Theta, \rho[\pi]\right] = C[K, T]$. In order to test whether the implied correlation estimate obtained in this way is accurate, we determine the model price $C^{mc}\left[K, T; \underline{\sigma}, \Theta, \rho[\pi]\right]$ using Monte Carlo simulation, where we plug in the volatility parameters and the implied correlation parameters. The results are listed in Table 10 and shown in Fig. 6. We observe that model and market prices are not exactly equal, but the error is still acceptable.

# *5.2 Double Exponential*

In the previous subsection we showed that the Lévy copula model allows for determining robust implied correlation estimates. However, calibrating this model can be a computationally challenging task. Indeed, in the case of the Dow Jones Industrial Average, there are 30 underlying stocks and each stock has approximately 5 traded option prices. The parameter vector $\Theta$ and the volatility parameters $\sigma_j$ have to be calibrated simultaneously. This contrasts sharply with the Gaussian copula model, where the calibration can be done stock by stock.

In this subsection we consider a model with the computationally attractive calibration property of the Gaussian copula model, but without imposing any normality assumption on the marginal log returns. To be more precise, given the convincing arguments exposed in Fig. 7, we would like to keep $L$ a $VG(\sigma, \nu, \theta, \mu)$ distribution. However, we do not calibrate the parameter vector $\Theta = (\sigma, \nu, \theta, \mu)$ to the vanilla option curves, but fix these parameters upfront as follows:

$$
\mu = 0, \quad \theta = 0, \quad \nu = 1 \quad \text{and} \quad \sigma = 1.
$$


**Table 10** Market quotes for Dow Jones Index options for different basket strikes on June 20, 2008

For each price we find the corresponding implied correlation and the model price using a one-factor Variance Gamma model with parameters listed in Table 9

**Fig. 6** Dow Jones option prices: Market prices (*circles*) and the model prices using a one-factor Variance Gamma model and the implied VG correlation smile (*crosses*) for June 20, 2008

**Fig. 7** Implied correlation smile for the Dow Jones, based on a Gaussian (*dots*) and a one-factor Variance Gamma model (*crosses*) for different trading days

In this setting, $L$ is a standardized distribution and its characteristic function $\phi_L$ is given by

$$\phi_L(u) = \frac{1}{1 + \frac{u^2}{2}}, \quad u \in \mathbb{R}.$$

From this characteristic function we see that $L$ has a *Standard Double Exponential distribution*, also called the Laplace distribution, and its pdf $f_L$ is given by

$$f_L(u) = \frac{\sqrt{2}}{2}\,\mathrm{e}^{-\sqrt{2}\,|u|}, \quad u \in \mathbb{R}.$$

The Standard Double Exponential distribution is symmetric and centered around zero, and it has unit variance. Note, however, that it is straightforward to generalize this distribution such that it has center $\mu$ and variance $\sigma^2$. Moreover, the kurtosis of this Double Exponential distribution is 6.
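As a quick sanity check of these moments, one can integrate the density numerically; the grid bounds and step size below are our own choices.

```python
import math

def f_L(u):
    """Standard Double Exponential (Laplace) density with zero mean and unit variance."""
    return 0.5 * math.sqrt(2.0) * math.exp(-math.sqrt(2.0) * abs(u))

def moment(k, lo=-30.0, hi=30.0, n=60_000):
    """k-th moment of f_L via the trapezoidal rule (the tails beyond |u| = 30 are negligible)."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        u = lo + i * h
        w = 0.5 if i in (0, n) else 1.0
        total += w * (u ** k) * f_L(u)
    return total * h
```

The density integrates to one, the second moment is the unit variance, and the ratio of the fourth moment to the squared variance recovers the kurtosis of 6 quoted above.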

By using the Double Exponential distribution instead of the more general Variance Gamma distribution, some flexibility for modeling the marginals is lost. However, the Double Exponential distribution is still a much better distribution for modeling stock returns than the normal distribution. Moreover, in this simplified setting, the only parameters to be calibrated are the marginal volatility parameters,

**Fig. 8** Implied correlation smiles in the one-factor Variance Gamma and the Double Exponential model

which we denote by $\sigma_j^{DE}$, and the correlation parameter $\rho^{DE}$. As in the Gaussian copula model, calibrating the volatility parameter $\sigma_j^{DE}$ only requires the option curve of stock $j$. As a result, the time to calibrate the Double Exponential copula model is comparable to that of its Gaussian counterpart and much shorter than that of the general Variance Gamma copula model.

Consider the DJ on March 25, 2008; the time to maturity is 25 days. We determine the implied marginal volatility parameter for each stock in a one-factor Variance Gamma model and in a Double Exponential framework. Given this information, we can determine the prices $C^{VG}[K, T]$ and $C^{DE}[K, T]$ of a basket option in a Variance Gamma and a Double Exponential model, respectively. Figure 8 shows the implied Variance Gamma and Double Exponential correlations. We observe that the implied correlation based on a one-factor VG model is larger than its Double Exponential counterpart for moneyness larger than one, whereas the two implied correlation estimates are relatively close to each other for lower moneyness.

# **6 Conclusion**

In this paper we introduced a one-factor Lévy model and proposed a three-moments-matching approximation for pricing basket options. Well-known distributions such as the Normal, Variance Gamma, NIG, and Meixner distributions can be used in this one-factor Lévy model. We calibrated these different models to market data and determined basket option prices in the different model settings. Our newly designed (approximate) basket option pricing formula can be used to define an implied Lévy correlation. The one-factor Lévy model provides a flexible framework for deriving implied correlation estimates in different model settings. Indeed, by employing a Brownian motion or a Variance Gamma process in our model, we can determine Gaussian and VG implied correlation estimates, respectively. We observe that the VG implied correlation improves on the Gaussian implied correlation.

**Acknowledgements** The authors acknowledge the financial support of the Onderzoeksfonds KU Leuven (GOA/13/002: Management of Financial and Actuarial Risks: Modeling, Regulation, Disclosure and Market Effects). Daniël Linders also acknowledges the support of the AXA Research Fund (Measuring and managing herd behavior risk in stock markets). The authors also thank Prof. Jan Dhaene, Prof. Alexander Kukush, the anonymous referees and the editors for helpful comments.

The KPMG Center of Excellence in Risk Management is acknowledged for organizing the conference "Challenges in Derivatives Markets - Fixed Income Modeling, Valuation Adjustments, Risk Management, and Regulation".

# **Appendix: Proof of Lemma 1**

The proof for expression (9) is straightforward.

Starting from the multinomial theorem, we can write the second moment $m_2$ as follows:

$$\begin{aligned} m_2 &= \mathbb{E}\left[\left(w_1 S_1(T) + w_2 S_2(T) + \dots + w_n S_n(T)\right)^2\right] \\ &= \mathbb{E}\left[\sum_{i_1 + i_2 + \dots + i_n = 2} \frac{2!}{i_1!\, i_2! \cdots i_n!} \prod_{j=1}^n \left(w_j S_j(T)\right)^{i_j}\right]. \end{aligned}$$

Considering the cases $(i_n = 0)$, $(i_n = 1)$, and $(i_n = 2)$ separately, we find

$$m_2 = \mathbb{E}\left[\left(\sum_{j=1}^{n-1} w_j S_j(T)\right)^2 + 2 w_n S_n(T) \sum_{j=1}^{n-1} w_j S_j(T) + w_n^2 S_n^2(T)\right].$$

Continuing recursively gives

$$m_2 = \sum_{j=1}^{n} \sum_{k=1}^{n} w_j w_k\, \mathbb{E}\left[S_j(T) S_k(T)\right]. \tag{15}$$

We then find that

$$\begin{split} m_2 &= \sum_{j=1}^{n} \sum_{k=1}^{n} w_j w_k\, S_j(0) S_k(0) \\ &\quad\times \mathbb{E}\left[\exp\left\{(2r - q_j - q_k - \omega_j - \omega_k) T + (\sigma_j A_j + \sigma_k A_k)\sqrt{T}\right\}\right] \\ &= \sum_{j=1}^{n} \sum_{k=1}^{n} w_j w_k \frac{\mathbb{E}\left[S_j(T)\right] \mathbb{E}\left[S_k(T)\right]}{\phi_L\left(-\mathrm{i}\sigma_j\sqrt{T}\right) \phi_L\left(-\mathrm{i}\sigma_k\sqrt{T}\right)}\, \mathbb{E}\left[\exp\left\{(\sigma_j A_j + \sigma_k A_k)\sqrt{T}\right\}\right]. \end{split}$$

In the last step, we used the expression $\omega_j = \frac{1}{T} \log \phi_L\left(-\mathrm{i}\sigma_j\sqrt{T}\right)$. If we use expression (1) to decompose $A_j$ and $A_k$ into the common component $X(\rho)$ and the independent components $X_j(1 - \rho)$ and $X_k(1 - \rho)$, we find the following expression for $m_2$:

$$m_2 = \sum_{j=1}^{n} \sum_{k=1}^{n} w_j w_k \frac{\mathbb{E}\left[S_j(T)\right] \mathbb{E}\left[S_k(T)\right]}{\phi_L\left(-\mathrm{i}\sigma_j\sqrt{T}\right) \phi_L\left(-\mathrm{i}\sigma_k\sqrt{T}\right)}\, \mathbb{E}\left[\mathrm{e}^{(\sigma_j + \sigma_k)\sqrt{T} X(\rho)}\, \mathrm{e}^{\sigma_j\sqrt{T} X_j(1-\rho)}\, \mathrm{e}^{\sigma_k\sqrt{T} X_k(1-\rho)}\right].$$

The r.v. $X(\rho)$ is independent of $X_j(1 - \rho)$ and $X_k(1 - \rho)$. Furthermore, the characteristic function of $X(\rho)$ is $\phi_L^{\rho}$, which results in


$$\begin{split} m_2 &= \sum_{j=1}^{n} \sum_{k=1}^{n} w_j w_k \frac{\mathbb{E}\left[S_j(T)\right] \mathbb{E}\left[S_k(T)\right]}{\phi_L\left(-\mathrm{i}\sigma_j\sqrt{T}\right) \phi_L\left(-\mathrm{i}\sigma_k\sqrt{T}\right)}\, \phi_L\left(-\mathrm{i}(\sigma_j + \sigma_k)\sqrt{T}\right)^{\rho} \\ &\quad\times \mathbb{E}\left[\mathrm{e}^{\sigma_j\sqrt{T} X_j(1-\rho)}\, \mathrm{e}^{\sigma_k\sqrt{T} X_k(1-\rho)}\right]. \end{split}$$

If *<sup>j</sup>* = *<sup>k</sup>*, *Xj*(<sup>1</sup> <sup>−</sup> ρ) and *Xk* (<sup>1</sup> <sup>−</sup> ρ) are i.i.d. with characteristic function <sup>φ</sup>1−<sup>ρ</sup> *<sup>L</sup>* , which gives the following expression for *m*2:

$$m\_2 = \sum\_{j=1}^{n} \sum\_{k=1}^{n} w\_j w\_k \mathbb{E}\left[S\_j(T)\right] \mathbb{E}\left[S\_k(T)\right] \left(\frac{\phi\_L\left(-\mathbf{i}(\sigma\_j + \sigma\_k)\sqrt{T}\right)}{\phi\_L\left(-\mathbf{i}\sigma\_j\sqrt{T}\right)\phi\_L\left(-\mathbf{i}\sigma\_k\sqrt{T}\right)}\right)^{\rho}$$

If $j = k$, we find that

$$\mathbb{E}\left[\mathrm{e}^{\sigma\_j\sqrt{T}X\_j(1-\rho)}\,\mathrm{e}^{\sigma\_k\sqrt{T}X\_k(1-\rho)}\right] = \phi\_L\left(-\mathrm{i}\left(\sigma\_j + \sigma\_k\right)\sqrt{T}\right)^{1-\rho},$$

which, combined with the factor $\phi\_L\left(-\mathrm{i}(\sigma\_j+\sigma\_k)\sqrt{T}\right)^{\rho}$ from the common component, gives the following expression for the terms of $m\_2$ with $j = k$:

$$\sum\_{j=1}^{n} w\_j^2\, \mathbb{E}\left[S\_j(T)\right]^2 \frac{\phi\_L\left(-2\mathrm{i}\sigma\_j\sqrt{T}\right)}{\phi\_L\left(-\mathrm{i}\sigma\_j\sqrt{T}\right)^2}.$$

This proves expression (10) for *m*2.
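As a sanity check on the formula for $m\_2$, the closed form can be compared with a Monte Carlo estimate in the Gaussian special case, where $\phi\_L(-\mathrm{i}u) = \mathrm{e}^{u^2/2}$. The sketch below uses hypothetical basket weights, volatilities, and forwards (they are illustration values, not taken from the text):

```python
import numpy as np

# Hypothetical basket parameters: weights w, vols sigma, forwards E[S_j(T)]
w = np.array([0.5, 0.3, 0.2])
sig = np.array([0.20, 0.30, 0.25])
fwd = np.array([100.0, 95.0, 110.0])
T, rho, n = 1.0, 0.4, 3

def phi(u):
    # phi_L evaluated at -i*u for a standard normal L: phi_L(-iu) = exp(u^2/2)
    return np.exp(0.5 * u ** 2)

# Closed-form m2: exponent rho for j != k, exponent 1 on the diagonal terms
m2 = 0.0
for j in range(n):
    for k in range(n):
        ratio = phi((sig[j] + sig[k]) * np.sqrt(T)) / (
            phi(sig[j] * np.sqrt(T)) * phi(sig[k] * np.sqrt(T)))
        m2 += w[j] * w[k] * fwd[j] * fwd[k] * ratio ** (rho if j != k else 1.0)

# Monte Carlo: A_j = X(rho) + X_j(1-rho), X(rho) ~ N(0, rho), X_j(1-rho) ~ N(0, 1-rho)
rng = np.random.default_rng(0)
paths = 400_000
common = np.sqrt(rho) * rng.standard_normal(paths)
idio = np.sqrt(1.0 - rho) * rng.standard_normal((n, paths))
A = common + idio
S = fwd[:, None] * np.exp(sig[:, None] * np.sqrt(T) * A) / phi(sig * np.sqrt(T))[:, None]
basket = w @ S
m2_mc = float(np.mean(basket ** 2))
print(m2, m2_mc)  # the two values should agree up to Monte Carlo error
```

The same decomposition into a common and idiosyncratic part drives the $m\_3$ calculation, with one extra index.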

We can write *m*<sup>3</sup> as follows

$$\begin{aligned} m\_3 &= \mathbb{E}\left[ \left( \sum\_{j=1}^n w\_j S\_j(T) \right)^3 \right] \\ &= \mathbb{E}\left[ \left( \sum\_{j=1}^n w\_j S\_j(T) \right)^2 \sum\_{l=1}^n w\_l S\_l(T) \right]. \end{aligned}$$

Using expression (15), we find the following expression for *m*3:

$$\begin{aligned} m\_3 &= \mathbb{E}\left[ \left( \sum\_{j=1}^n \sum\_{k=1}^n w\_j w\_k S\_j(T) S\_k(T) \right) \sum\_{l=1}^n w\_l S\_l(T) \right] \\ &= \sum\_{j=1}^n \sum\_{k=1}^n \sum\_{l=1}^n w\_j w\_k w\_l\, \mathbb{E}\left[ S\_j(T) S\_k(T) S\_l(T) \right]. \end{aligned}$$

Similar calculations as for *m*<sup>2</sup> result in

$$\begin{split} m\_{3} &= \sum\_{j=1}^{n} \sum\_{k=1}^{n} \sum\_{l=1}^{n} w\_{j} w\_{k} w\_{l}\, \mathbb{E} \left[ S\_{j}(T) \right] \mathbb{E} \left[ S\_{k}(T) \right] \mathbb{E} \left[ S\_{l}(T) \right] \\ &\quad\times \frac{\phi\_{L} \left( -\mathrm{i} (\sigma\_{j} + \sigma\_{k} + \sigma\_{l}) \sqrt{T} \right)^{\rho}}{\phi\_{L} \left( -\mathrm{i} \sigma\_{j} \sqrt{T} \right) \phi\_{L} \left( -\mathrm{i} \sigma\_{k} \sqrt{T} \right) \phi\_{L} \left( -\mathrm{i} \sigma\_{l} \sqrt{T} \right)}\, A\_{j,k,l}, \end{split}$$

where

$$A\_{j,k,l} = \mathbb{E}\left[\mathrm{e}^{\sigma\_j\sqrt{T}X\_j(1-\rho)}\,\mathrm{e}^{\sigma\_k\sqrt{T}X\_k(1-\rho)}\,\mathrm{e}^{\sigma\_l\sqrt{T}X\_l(1-\rho)}\right].$$

Distinguishing between the situations $(j = k = l)$, $(j = k,\ k \neq l)$, $(j \neq k,\ k = l)$, $(j \neq k,\ k \neq l,\ j = l)$, and $(j \neq k,\ k \neq l,\ j \neq l)$, we find expression (11).

**Open Access** This chapter is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, a link is provided to the Creative Commons license and any changes made are indicated.

The images or other third party material in this chapter are included in the work's Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work's Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.


# **Pricing Shared-Loss Hedge Fund Fee Structures**

#### **Ben Djerroud, David Saunders, Luis Seco and Mohammad Shakourifar**

**Abstract** The asset management business is driven by fee structures. In the context of hedge funds, fees have usually been a hybrid combination of two different types, which coined the well-known business term "2 and 20". In an attempt to align managers better with their investors, in a new context of low interest rates and lukewarm performance, a new type of fund fee has been introduced in the last few years that offers a more symmetric payment structure, which we will refer to as *shared loss*. In this framework, in return for receiving performance fees, the fund manager provides the investors with some downside protection against losses. We show that the position values of the investor and the hedge fund manager can be formulated as portfolios of options, and discuss issues regarding pricing and fairness of the fee rates, and incentives for both investors and hedge fund managers. In particular, we show that, from a present value perspective, these fee structures can be set up to favor either the hedge fund manager or the investor. The paper is based on an arbitrage-free pricing framework. However, if one takes into account the value to the business that investor capital brings to a fund, which is not part of our framework, it is possible to create a situation where both investors and asset managers win.

B. Djerroud · M. Shakourifar (B) Sigma Analysis & Management, Toronto, ON, Canada e-mail: mohammad@sigmanalysis.com

B. Djerroud e-mail: ben\_d@sigmanalysis.com

D. Saunders Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Canada e-mail: dsaunders@uwaterloo.ca

L. Seco Department of Mathematics, University of Toronto, Toronto, Canada e-mail: seco@math.utoronto.ca

This research was supported in part by the Natural Sciences and Engineering Research Council of Canada.

**Keywords** Hedge funds · Fee structures · First-loss · Shared-loss · Black-Scholes option pricing

# **1 Introduction**

Hedge funds are pooled investment vehicles overseen by a management company. They generally aim at absolute-return portfolios, and their success is usually linked to market inefficiencies, such as instrument mispricing, misguided market consensus or, in general terms, the manager's ability to anticipate market moves. The nature of these investments is that they exploit investment opportunities that are rare. This is a characteristic they share with private equity investments, but they share with the mutual fund industry the fact that they often trade in liquid, marketable securities. Fund sizes are more in line with private equity investing than with the mammoth mutual fund industry. Their compensation structure, because of their limited access to opportunity, is also more in line with the private equity universe, and usually consists of a fixed, asset-based fee and a variable performance fee. Because of market conditions that have been in place over the last several years, in particular the low interest rate environment, coupled with the lukewarm performance of the hedge fund sector in recent years, investors have become increasingly sensitive to fee structures. The traditional 2&20 fee structure, consisting of a flat fee of 2 % of assets under management together with a performance fee of 20 % of net profits, is considered unfair on the basis of its asymmetry: the management company will always earn a fee, whereas the investor is only guaranteed to pay that fee. The advent of the 40-ACT funds<sup>1</sup> has, in particular, dispensed with the performance fee in favor of a fixed management fee, which is more in line with the mutual fund industry than with the hedge fund industry. This compensation model essentially rewards funds for becoming asset gatherers instead of the alpha-seeking business the hedge fund was set out to be.
In this paper we will examine, from a quantitative perspective, a suite of symmetric performance fee structures which are gaining traction with more sophisticated investors, known as first-loss (or shared-loss) fee structures. In this new framework, in return for receiving performance fees, the fund manager provides some downside protection against losses to the investors.

The issue of the incentives created by hedge fund fees bears much similarity to issues surrounding the structure of executive compensation. At first glance, the optionality inherent in both would seem to incentivize greater risk taking. However, the reality is more subtle. Carpenter [2] studies the case of executive compensation,

<sup>1</sup>Pooled investment vehicles, registered with and regulated by the Securities and Exchange Commission under the Investment Company Act of 1940, that are packaged and sold to retail and institutional investors in the public markets.

when the manager cannot hedge options provided as compensation by trading the underlying. Under certain conditions, a utility-maximizing manager may choose to reduce rather than increase the volatility of the underlying firm. Ross [9] gives necessary and sufficient conditions for a fee schedule to make a utility-maximizing manager more or less risk-averse. Hodder and Jackwerth [6] consider the effects of hedge fund fee incentives on a manager with power utility, also in the presence of a liquidation barrier. They find that over a one-year horizon, risk-taking varies dramatically with fund value, but that this effect is moderated over longer time horizons. Kouwenberg and Ziemba [7] consider loss-averse hedge fund managers and find that higher incentive fees lead to riskier fund management strategies; this effect is reduced, however, if a significant portion of the manager's own money is invested in the fund. They further provide empirical evidence that hedge funds with incentive fees have significantly lower mean returns (net of fees), and find a positive correlation between fee levels and downside risk. When the manager's objective function is based on cumulative prospect theory rather than utility, they find that risk increases with the performance fee. Recent work on the analysis of hedge fund fee structures includes that of Goetzmann et al. [3], who value a fee structure with a high-water mark provision using a PDE approach with a fixed investment portfolio, Panageas and Westerfield [8], who consider the portfolio selection decision of maximizing the present value of fees for a risk-neutral manager over an infinite horizon, and Guasoni and Obłój [4], who extend this work to managers with risk-averse power utility.
Closest to the current work is He and Kou [5], who analyze shared-loss fee structures for hedge funds by looking at the portfolio selection decision of a hedge fund manager whose preferences are modeled using cumulative prospect theory. The problem is considered in the presence of a manager investing in the fund, and with a predetermined liquidation barrier. Analytical solutions of the portfolio selection problem are provided, and the resulting value (under cumulative prospect theory) for both the investor and the manager is examined. It is found that, depending on the parameter values, either a traditional fee structure or a first-loss fee structure may result in a riskier investment strategy. While for some parameter values the first-loss structure improves the utility of both the investor and the hedge fund manager, they find that for typical values the manager is better off, while the investor is worse off. In this paper, we investigate shared-loss fee structures from the perspective of risk-neutral valuation, with no further assumptions about investor preferences, whereas He and Kou [5] solve the stochastic control problem (under the real-world measure) corresponding to the manager maximizing the utility function from cumulative prospect theory, and also evaluate the investor's payoff using the same type of criterion.

The paper is organized as follows. First, we will review the traditional fee structures in some detail. Next, we will introduce the notion and mechanics of the first-loss structures, and a framework for a fee pricing based on the theory of option price valuation. After that, we will introduce the concept of net fee, a number that will allow us to determine whether the investor or the management company is the net winner in a given fee agreement. Finally, we will present a set of computational examples that will display the net fee as a function of the agreement and market variables.

# **2 Hedge Fund Fees**

The hedge fund manager charges two types of fees to the fund investors:

- a *management fee*, a fixed percentage of the assets under management, payable regardless of performance;
- a *performance fee*, a percentage of the net profits generated by the fund.
In this paper we assume a single investor and a single share issued by the fund. The extension to the case of multiple investors and multiple shares is straightforward. Although fees are paid according to a determined schedule (usually monthly or quarterly for management fees and annual for performance fees), we will assume a single payment at the end of a fixed term *T* .

The fund value evolution and fee payment mechanics are denoted as follows: the initial fund supplied by the investor is *X*0. The hedge fund manager then invests fund assets to create future gross values *Xt* , for *t* > 0. The gross fund value *Xt* is split between the investor's worth *It* (the net asset value) and the manager's fee *Mt* :

$$X\_t = I\_t + M\_t.$$

At time 0, *X*<sup>0</sup> = *I*<sup>0</sup> and *M*<sup>0</sup> = 0.

There are countless variations to this basic framework, including hurdles, clawbacks, etc. (for more details on first-loss arrangements see Banzaca [1]). We will ignore those and assume the commonly used version of a management fee equal to *m* · *X*<sup>0</sup> (*m* represents a fixed percentage of the initial investment by the investor), and a performance fee of

$$
\alpha \cdot (X\_T - (1+m)X\_0)\_+,
$$

payable only when it is positive, and equal to zero when it is negative. Hence,

$$M\_T = m \cdot X\_0 + \alpha \cdot (X\_T - (1+m)X\_0)\_+ \tag{1}$$

In other words, while the management fee is a fixed future liability to the investor, the performance fee is a contingent claim on the part of the manager. As a consequence, we will price the management fee simply as a fixed guaranteed fee with a predetermined future cash value, and we will value the performance fee as the value of a certain call option. In our setting, we will assume normally distributed log-returns for the invested assets *Xt* , which allows us to value the performance fee in the Black–Scholes framework. It is worth mentioning that hedge fund managers can speculate on volatility, credit risk, etc., and, in contrast to traditional money managers, they can go long and short. The diversity in investment styles and the different levels of gross and net exposure that they can employ could result in leptokurtic (non-normal) properties of their returns, revealed through frequent large negative returns in the left tail of the return distribution. Generalizing the current framework to models that account for the non-normality of hedge fund returns, for example by employing generalized autoregressive conditional heteroskedasticity (GARCH) models, could be a subject for future research.

# **3 The First-Loss Model**

CalPERS announced in 2014 that it was exiting hedge fund investments (WSJ [10]). While not the main stated reason for the decision, one reason mentioned was the high fees payable to their hedge fund managers, something that has caught the attention of investors worldwide amid the now widely accepted notion that hedge fund fees are too high. Certain hedge funds are reacting to this shifting balance of power between the sell side and the buy side of the investment business with the creation of innovative fee structures which still reward the intellectual capital of the hedge fund manager and allow for business growth, but at the same time offer the investor a more symmetric compensation structure.

An example of a first-loss structure is the following: the manager deposits an amount equal to 10 % of the investor's capital into the fund, and this deposit absorbs the first losses of the investment; in exchange, the manager receives an increased performance fee of 50 % of profits (these are the levels used in the base case of Sect. 5.1).
In our paper we present a quantitative comparison of the fees payable to the manager and the risk-neutral valuation of the guarantee offered to the investor. We note, for the sake of completeness, that there are many other qualitative considerations which are relevant when analyzing both the fee structure and the business value offered to a management company by the investor; these are not the objective of this paper. In fact, hedge fund start-ups have become more difficult in recent times, which increases the value of any investor action that allows a hedge fund business to succeed. That value is linked to a wide variety of fund characteristics, including the size of assets under management (AUM), the track record, or historical performance, and the reputation of its investor base, among others.

In addition to the initial investment *X*0, the management fee *m* and the performance fee α, payable at a fixed time horizon *T* , we will now also consider a deposit amount *c*, as a percentage of the initial investment *X*0, which the manager provides as a guarantee against losses. Our objective is to analyze the relationship between all four variables to determine whether the investor or the manager is the net winner of value-add from a risk-neutral valuation perspective.

# **4 An Option Pricing Framework**

The fund value *Xt* is split between the investor's worth *It* and the manager's fee *Mt* , where *Xt* = *It* + *Mt* . In the following subsections we derive the payoff function of each player separately, and then price the positions accordingly.

# *4.1 Payoff to the Investor*

The payoff to the investor at the terminal time *T* is:

$$I\_T = \begin{cases} X\_T - mX\_0 - \alpha (X\_T - mX\_0 - X\_0) & \text{when } X\_T - mX\_0 \ge X\_0 \\ X\_0 & \text{when } (1 - c)X\_0 \le X\_T - mX\_0 \le X\_0 \\ X\_T + (c - m)X\_0 & \text{when } X\_T - mX\_0 \le (1 - c)X\_0 \end{cases}$$

or, writing the payoff in a more compact form:

$$\begin{array}{ll} I\_T = & X\_T - mX\_0 \quad \text{(pays the management fee)}\\ & -\,\alpha (X\_T - mX\_0 - X\_0)\_+ \quad \text{(pays the performance fee)}\\ & +\,(X\_0 - X\_T + mX\_0)\_+ - ((1 - c)X\_0 - X\_T + mX\_0)\_+ \quad \text{(receives the guarantee)} \end{array}$$

Thus, we see that the position of the investor is equivalent to the following portfolio:

- a long position in the fund assets *XT*;
- a short cash position of *mX*0 (the management fee);
- a short position in α call options on *XT* with strike (1 + *m*)*X*0 (the performance fee);
- a long put option with strike (1 + *m*)*X*0 and a short put option with strike (1 − *c* + *m*)*X*0, i.e., a long put spread (the guarantee).
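As a consistency check, the piecewise payoff from Sect. 4.1 and the option-portfolio representation can be compared numerically. The following sketch uses hypothetical parameter values (not the paper's base case, since it includes a nonzero management fee):

```python
import numpy as np

# Hypothetical contract parameters: initial capital, management fee,
# performance fee, and first-loss deposit
X0, m, alpha, c = 1.0, 0.02, 0.5, 0.10

def piecewise_investor(XT):
    # the three payoff cases of Sect. 4.1
    if XT - m * X0 >= X0:
        return XT - m * X0 - alpha * (XT - m * X0 - X0)
    if XT - m * X0 >= (1 - c) * X0:
        return X0
    return XT + (c - m) * X0

def option_investor(XT):
    # long fund, short fee, short alpha calls, long put spread
    pos = lambda x: max(x, 0.0)
    return (XT - m * X0
            - alpha * pos(XT - m * X0 - X0)
            + pos(X0 - XT + m * X0)
            - pos((1 - c) * X0 - XT + m * X0))

for XT in np.linspace(0.5, 1.6, 23):
    assert abs(piecewise_investor(XT) - option_investor(XT)) < 1e-12
print("piecewise and option-portfolio representations agree")
```

The manager's payoff needs no separate check: by construction it is *MT* = *XT* minus the investor's payoff.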
# *4.2 Payoff to the Manager*

The payoff to the manager is *MT* = *XT* − *IT* . In other words, the payoff to the hedge fund manager results from the manager having the opposite position in all of the options of the investor. More explicitly,


$$\begin{array}{ll} M\_T = & mX\_0 \quad \text{(receives the management fee)}\\ & +\,\alpha (X\_T - mX\_0 - X\_0)\_+ \quad \text{(receives the performance fee)}\\ & -\,(X\_0 - X\_T + mX\_0)\_+ + ((1-c)X\_0 - X\_T + mX\_0)\_+ \quad \text{(provides the guarantee)} \end{array}$$

which implies that the hedge fund manager has a portfolio of options consisting of:

- a long cash position of *mX*0 (the management fee);
- a long position in α call options on *XT* with strike (1 + *m*)*X*0 (the performance fee);
- a short put option with strike (1 + *m*)*X*0 and a long put option with strike (1 − *c* + *m*)*X*0, i.e., a short put spread (the guarantee).
Note that net income to the management company is now no longer guaranteed to be positive. In addition, since the options trades constitute a zero-sum game (the positions of the manager and the investor are opposite each other), the sum of the investor payoff and the manager payoff is equal to *XT* .

# *4.3 Valuation: Pricing Fees as Derivatives*

In this section, we will value the positions of the investor and the hedge fund manager using a simple Black–Scholes model for the underlying fund value process. In particular, we employ risk-neutral valuation, and assume that under the risk-neutral probabilities, the fund value process satisfies the stochastic differential equation:

$$dX\_t = rX\_t\,dt + \sigma X\_t\,dW\_t, \tag{2}$$

with solution:

$$X\_t = X\_0 \exp\left( (r - \frac{\sigma^2}{2})t + \sigma W\_t \right) \tag{3}$$

where *Wt* is a standard Brownian motion, and *r* and σ are positive constants, giving the continuously compounded risk-free interest rate and the volatility of the hedge fund assets respectively. It should be noted that the Black–Scholes framework is applicable to our context as the underlying, that is the fund value, can be dynamically traded. Moreover, in a managed account context, even the liquidity of the fund can be made to match the liquidity of the underlying traded securities.
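Under the risk-neutral dynamics (2), the discounted fund value is a martingale, so $\mathbb{E}\left[\mathrm{e}^{-rT}X\_T\right] = X\_0$; this can be checked by simulating the explicit solution (3). A small sketch with the base-case parameters:

```python
import numpy as np

# base-case parameters: r = 2 %, sigma = 15 %, horizon one month
X0, r, sigma, T = 1.0, 0.02, 0.15, 1.0 / 12.0
rng = np.random.default_rng(42)
W_T = np.sqrt(T) * rng.standard_normal(1_000_000)  # Brownian motion at time T

# exact solution (3) of the SDE (2)
X_T = X0 * np.exp((r - 0.5 * sigma ** 2) * T + sigma * W_T)

# martingale check: the discounted expectation returns the initial value
disc_mean = float(np.exp(-r * T) * X_T.mean())
print(disc_mean)  # ~ X0 = 1.0, up to Monte Carlo error
```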

The Black–Scholes formula can be used to derive the price of the investor's position under the Black–Scholes model:

$$\begin{aligned} V\_I(0) &= X\_0 - e^{-rT} m X\_0 - \alpha C(X\_0, T, X\_0 + m X\_0, r, \sigma) \\ &+ P(X\_0, T, X\_0 + m X\_0, r, \sigma) - P(X\_0, T, (1 - c) X\_0 + m X\_0, r, \sigma) \end{aligned} (4)$$

where *C*(*X*, *T*, *K*, *r*, σ) is the Black–Scholes price of a call option on a non-dividend-paying asset with current value of the underlying *X*, time to expiration *T* , strike price *K*, risk-free interest rate *r* and volatility σ, and *P*(*X*, *T*, *K*, *r*, σ) is the Black–Scholes put option price with the same parameters as arguments.

# **5 Consequences of the Derivative Pricing Framework**

# *5.1 Graphical Analysis*

To compare and contrast the traditional and shared-loss fee structures, in our base case we take the investment horizon to be one month, that is *T* = 1/12, the performance fee α = 50 %, the manager deposit *c* = 10 %, the risk-free interest rate *r* = 2 %, the volatility σ = 15 %, and the initial investment *X*<sup>0</sup> = \$1. For simplicity and without loss of generality we assume a *zero management fee* for our base case.

With our base case parameters, the total value of the investor's payoff is 1.0073, and the value of the manager's payoff is −0.0073. Notice that the value of the investor's payoff is greater than the initial investment of 1. In contrast, the price of the traditional investor payoff (without the insurance part of the payoff—i.e. removing both put options) is 0.9909, and the value of the manager's payoff in this instance is 0.0091.
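These values can be reproduced by implementing Eq. (4) directly with closed-form Black–Scholes prices. A minimal sketch (the helper functions are ours, not from the paper):

```python
from math import erf, exp, log, sqrt

def N(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(X, T, K, r, sigma):
    d1 = (log(X / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return X * N(d1) - K * exp(-r * T) * N(d2)

def bs_put(X, T, K, r, sigma):
    # put-call parity
    return bs_call(X, T, K, r, sigma) - X + K * exp(-r * T)

def investor_value(X0, T, m, alpha, c, r, sigma):
    """Present value of the investor's position, Eq. (4)."""
    return (X0 - exp(-r * T) * m * X0
            - alpha * bs_call(X0, T, (1 + m) * X0, r, sigma)
            + bs_put(X0, T, (1 + m) * X0, r, sigma)
            - bs_put(X0, T, (1 - c + m) * X0, r, sigma))

# base case: T = 1/12, alpha = 50 %, c = 10 %, r = 2 %, sigma = 15 %, m = 0
X0, T, m, alpha, c, r, sigma = 1.0, 1.0 / 12.0, 0.0, 0.5, 0.1, 0.02, 0.15
v_i = investor_value(X0, T, m, alpha, c, r, sigma)
v_m = X0 - v_i  # zero-sum: the manager holds the opposite positions
# traditional structure: same fees but without the two guarantee puts
v_trad = X0 - exp(-r * T) * m * X0 - alpha * bs_call(X0, T, (1 + m) * X0, r, sigma)
print(round(v_i, 4), round(v_m, 4), round(v_trad, 4))  # → 1.0073 -0.0073 0.9909
```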

#### **5.1.1 Payoff Functions of the Investor and the Manager**

The payoff functions of the investor under the shared-loss and the traditional fee structures are given in Fig. 1. The payoff to the hedge fund manager using the aforementioned benchmark values and under the shared-loss fee structure is also depicted in Fig. 2 along with the traditional payoff structure with only the performance fee α(*XT* − *X*0)+. Observe that since the options trades constitute a zero-sum game (the positions of the manager and the investor are opposite each other), the sum of the investor payoff and the manager payoff is equal to *XT* .

Figure 3 illustrates the 'fair performance fee', at which the investor gets a payoff with present value equal to his initial cash injection *X*0, for given levels of volatility and of the manager's deposit, i.e., we set *VI*(0) = *X*0. The fair performance fee can easily be obtained from Eq. (4) as

$$\alpha\_{\text{fair}} = \frac{-e^{-rT}mX\_0 + P(X\_0, T, X\_0 + mX\_0, r, \sigma) - P(X\_0, T, (1 - c)X\_0 + mX\_0, r, \sigma)}{C(X\_0, T, X\_0 + mX\_0, r, \sigma)}$$

**Fig. 1** Payoff for the hedge fund investor

**Fig. 2** Payoff for the hedge fund manager

The interested reader can derive explicit, well-known expressions for the sensitivities of αfair with respect to the different parameters in terms of the Greeks (in particular the vega) of the options involved. As can be seen from the figure, for small values of volatility the fair performance fee is insensitive to the level of the manager's deposit; however, as volatility increases, a higher level of deposit by the manager translates into a higher performance fee paid by the investor to make the deal a fair one. In Fig. 4, we normalize the volatility on the horizontal axis by the manager's deposit, defined *as a percentage of the initial investment X*0. For a given level of deposit, the higher the volatility of the underlying investment, the higher the probability that the loss incurred by the manager exceeds the deposit. In other words, the probability that the manager exercises the put option offered by the investor increases, which results in a reversal of the fair performance fee for higher levels of volatility. This is clearly illustrated in Fig. 4, where volatility and deposit are combined into a single scaling variable, volatility/deposit, *where the deposit is expressed as a percentage of the initial investment X*0. The corresponding maximum value of the fair performance fee increases with the size of the deposit; this is because for higher deposits the manager has to lose more and more before the investor starts bearing the residual loss, so the manager's compensation should be correspondingly higher. Note that the x-axis in Figs. 3 and 4 incorporates the annual volatility of the fund assets; however, the performance fee is crystallized on a monthly basis, which suggests a comparison between the deposit level and *monthly* volatility, as opposed to annual volatility. Since returns are assumed to follow a normal distribution in our Black–Scholes framework, one can explicitly calculate the probability of the return falling into a certain interval; in particular, with about 68 % probability the return falls within one standard deviation of the mean. This explains why the curves for the various deposits reach a maximum roughly around the same level of the (annual) volatility/deposit ratio, in the interval [1, 2].

**Fig. 3** Fair performance fee versus volatility

**Fig. 4** Fair performance fee versus normalized (by deposit) volatility
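The formula for αfair can be implemented and verified against Eq. (4): by construction, plugging αfair back into the investor's valuation must return exactly *X*0. A self-contained sketch (helper functions are ours):

```python
from math import erf, exp, log, sqrt

def N(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(X, T, K, r, sigma):
    d1 = (log(X / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    return X * N(d1) - K * exp(-r * T) * N(d1 - sigma * sqrt(T))

def bs_put(X, T, K, r, sigma):
    return bs_call(X, T, K, r, sigma) - X + K * exp(-r * T)

def alpha_fair(X0, T, m, c, r, sigma):
    # performance fee that makes the investor's position worth exactly X0
    return ((-exp(-r * T) * m * X0
             + bs_put(X0, T, (1 + m) * X0, r, sigma)
             - bs_put(X0, T, (1 - c + m) * X0, r, sigma))
            / bs_call(X0, T, (1 + m) * X0, r, sigma))

# base case of Sect. 5.1
X0, T, m, c, r, sigma = 1.0, 1.0 / 12.0, 0.0, 0.10, 0.02, 0.15
a = alpha_fair(X0, T, m, c, r, sigma)

# check: with alpha = alpha_fair the investor's present value equals X0
v = (X0 - exp(-r * T) * m * X0
     - a * bs_call(X0, T, (1 + m) * X0, r, sigma)
     + bs_put(X0, T, (1 + m) * X0, r, sigma)
     - bs_put(X0, T, (1 - c + m) * X0, r, sigma))
print(round(a, 3), round(v, 6))  # fair fee (roughly 0.9 for these inputs) and v = 1.0
```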

# *5.2 Sensitivity Analysis*

In this section, we perform a sensitivity analysis of the prices of the investor's and manager's payouts, as a function of the different model parameters.

#### **5.2.1 Volatility (***σ***)**

Figure 5 shows the value of the investor's position as a function of the volatility parameter σ, as σ ranges from 5 % to 60 %.

**Fig. 5** Value of the investor's position versus volatility σ

We see that the position is initially an increasing function of the volatility, owing to the increasing value of the investor's put option as a function of σ. However, as the volatility becomes very large, the value of the investor's position starts to decline, as the hedge fund's call option, as well as its put option, becomes more valuable. The maximum value for the investor occurs at a volatility around σ = 32.5 %. Observe, however, that the value is relatively insensitive to the level of σ, with a minimum value of 1.0016 and a maximum value of 1.0118.
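These sensitivity figures can be reproduced with a simple grid scan over σ. A self-contained sketch (Black–Scholes helpers are ours):

```python
from math import erf, exp, log, sqrt

def N(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(X, T, K, r, sigma):
    d1 = (log(X / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    return X * N(d1) - K * exp(-r * T) * N(d1 - sigma * sqrt(T))

def bs_put(X, T, K, r, sigma):
    return bs_call(X, T, K, r, sigma) - X + K * exp(-r * T)

def investor_value(X0, T, m, alpha, c, r, sigma):
    # Eq. (4)
    return (X0 - exp(-r * T) * m * X0
            - alpha * bs_call(X0, T, (1 + m) * X0, r, sigma)
            + bs_put(X0, T, (1 + m) * X0, r, sigma)
            - bs_put(X0, T, (1 - c + m) * X0, r, sigma))

X0, T, m, alpha, c, r = 1.0, 1.0 / 12.0, 0.0, 0.5, 0.1, 0.02
grid = [0.05 + 0.0025 * i for i in range(221)]  # sigma from 5 % to 60 %
vals = [investor_value(X0, T, m, alpha, c, r, s) for s in grid]
s_max = grid[vals.index(max(vals))]
# minimum ~1.0016, maximum ~1.0118, attained near sigma = 0.325
print(round(min(vals), 4), round(max(vals), 4), s_max)
```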

#### **5.2.2 Manager Deposit (***c***)**

We varied the manager deposit between 1 % and 25 %, while holding all other parameters at their base case values. The results of the sensitivity analysis are shown in Fig. 6.

As would be expected, the value of the investor's position is an increasing function of the manager's deposit. The value of the position equals one (the break-even point, or 'fair fee point') at around *c* = 0.0233. Any deposit level less than *c* = 0.0233 puts the investor at a disadvantage, and the investor is essentially indifferent to deposit levels higher than 10 %.
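Since the investor's value is increasing in *c*, the break-even deposit can be located by bisection. A self-contained sketch (helper functions are ours):

```python
from math import erf, exp, log, sqrt

def N(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(X, T, K, r, sigma):
    d1 = (log(X / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    return X * N(d1) - K * exp(-r * T) * N(d1 - sigma * sqrt(T))

def bs_put(X, T, K, r, sigma):
    return bs_call(X, T, K, r, sigma) - X + K * exp(-r * T)

def investor_value(X0, T, m, alpha, c, r, sigma):
    # Eq. (4)
    return (X0 - exp(-r * T) * m * X0
            - alpha * bs_call(X0, T, (1 + m) * X0, r, sigma)
            + bs_put(X0, T, (1 + m) * X0, r, sigma)
            - bs_put(X0, T, (1 - c + m) * X0, r, sigma))

X0, T, m, alpha, r, sigma = 1.0, 1.0 / 12.0, 0.0, 0.5, 0.02, 0.15

# bisection on c: investor_value is increasing in c, find value = X0
lo, hi = 0.001, 0.25
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if investor_value(X0, T, m, alpha, mid, r, sigma) < X0:
        lo = mid
    else:
        hi = mid
c_star = 0.5 * (lo + hi)
print(round(c_star, 4))  # break-even deposit, close to the c = 0.0233 reported above
```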

**Fig. 6** Value of the investor's position versus manager's deposit *c*

**Fig. 7** Value of the investor's position versus the expiration date *T*

#### **5.2.3 Maturity Date (***T***)**

The dependence on the time to maturity is of interest especially when adapting the results of this paper to realistic situations. As mentioned earlier, our mathematical assumption is that fees are paid at a fixed time in the future. In practice, fees are payable according to calendars agreed between the investors and the manager. In the graphs that follow, we address this by varying the expiration date *T* from 1 day to 1 year. The results are shown in Fig. 7.

Initially, the value of the position is increasing in *T* , but eventually, it begins to decrease in *T* , as the options given to the hedge fund manager become more valuable. The maximum value of the investor's position occurs at *T* around one quarter of a year (*T* ∼ 0.22).

# **6 Conclusion**

The exchange of business value between the manager and the investor is always a complex one: beyond the fees paid, there are intangibles the investor gives the manager. An asset management business is valued taking into account many factors, such as track record, years in business, assets under management, the reputation of its investors and, of course, fees. In this paper we focus on first-loss fee structures, which are bringing novel points of attention to the historical discussions on fair compensation between investors and hedge fund managers. We focus only on the fee payable by the investor and the guarantee offered by the manager, which is the main novelty in this setup. The main challenge in this new paradigm is to evaluate the guarantee offered by the hedge fund manager in relation to the fee paid by the investor. We developed a mathematical approach to compare the two features, guarantee and performance fee, from an option pricing perspective. The framework is flexible, can be used for different specific investment settings, and can account for slight variations from one fund to another. Our salient leitmotif is: fee agreements must be structured to be attractive to managers, so that they are willing to participate, and at the same time provide the investor with a cushion against losses. A significant contribution, which sheds light on the road map and paves the way for deeper investigations, is to see, and more importantly to formulate, the underlying fee structure through the lens of option valuation. By employing a risk-neutral framework and option pricing theory, one is able not only to price the investor's and manager's positions, but also to analyze the sensitivity of their values with respect to a set of influential parameters.

**Acknowledgements** We wish to express our gratitude to Sigma Analysis & Management Ltd., and especially to Dr. Ranjan Bhaduri and Mr. Kurt Henry for many valuable discussions.

The KPMG Center of Excellence in Risk Management is acknowledged for organizing the conference "Challenges in Derivatives Markets - Fixed Income Modeling, Valuation Adjustments, Risk Management, and Regulation".

**Open Access** This chapter is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, a link is provided to the Creative Commons license and any changes made are indicated.

The images or other third party material in this chapter are included in the work's Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work's Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.

# **References**


# **Negative Basis Measurement: Finding the Holy Scale**

**German Bernhart and Jan-Frederik Mai**

**Abstract** Investing into a bond and at the same time buying CDS protection on the same bond is known as buying a basis package. Loosely speaking, if the bond pays more than the CDS protection costs, the position has an allegedly risk-free positive payoff known as "negative basis". However, several different mathematical definitions of the negative basis are present in the literature. The present article introduces an innovative measurement, which is demonstrated to fit better into arbitrage pricing theory than existing approaches. This topic is not only interesting for negative basis investors. It also affects derivative pricing in general, since the negative basis might act as a liquidity spread that contributes as a net funding cost to the value of a transaction; see Morini and Prampolini (Risk, 58–63, 2011, [23]).

**Keywords** Negative basis measurement · Bond-CDS basis · Hidden yield

# **1 Introduction**

At first glance, it is surprising that investing in a bond and buying CDS protection on that underlying bond, henceforth called a basis package, can earn an attractive spread on top of the risk-free rate of return, as it appears to be free of default risk. This excess return over the risk-free rate is informally called the *negative basis*<sup>1</sup>; more formal definitions are given in the main body of this article. [8] has even devoted an entire book to the topic. If, conversely, the cost of CDS protection exceeds the bond earnings, one speaks of a positive basis. In this article, we only speak of negative bases, as the concepts of positive and negative basis are fundamentally just inverses of each other.

G. Bernhart · J.-F. Mai (B)

XAIA Investment, Sonnenstraße 19, 80331 München, Germany e-mail: jan-frederik.mai@xaia.com

G. Bernhart e-mail: german.bernhart@xaia.com

<sup>1</sup>Sometimes also called *bond-CDS basis*.

The appropriate measurement of negative basis plays an important role with regard to the cost of funding literature, which has become of paramount interest in the financial industry since the recent liquidity crisis. Generally speaking, this stream of literature reconsiders the pricing of derivatives under the new post-crisis fundamentals regarding funding, liquidity, and credit risk issues. Substantial contributions have been made, among others, by [5, 7, 12, 13, 23, 27, 29]. Loosely speaking, most references agree upon the fact that, at least under certain simplifying assumptions (full, bilateral, and continuous collateralization), derivative contracts can be evaluated in the traditional way, only the involved discount factors have to be adjusted by means of a spread accounting for funding and liquidity charges. In particular, [23] show in a simple, theoretical framework that the negative basis is a spread which plays an essential role in this regard. In order to set these theoretical findings into action in the industry's pricing machinery, it is therefore an essential task to establish viable and reasonable measurements for the negative basis. The present article shows that this topic is not only important but also challenging, and contributes a careful comparison of three different measurement methods. In particular, we point out why the most common measurement approaches (denoted by (Z) and (PE) below) are not recommended, and propose a decent alternative.

In the present article, we take the point of view of a negative basis investor whose goal is to detect interesting negative basis positions and to monitor the evolution of such investments over time. Alternatively, consider a bank which has to evaluate its derivative book. As the aforementioned references show that the required discount factors for the pricing algorithms might have to be adjusted by means of the negative basis, one faces the task of measuring this negative basis appropriately. For the effective implementation of these tasks, it is crucial to come up with a viable and economically reasonable mathematical definition of what the negative basis actually is. Specific focus is put on simple-to-implement approaches that rely on commonly applied pricing methodologies for bonds and CDS, described in, e.g., [18, 25]. In total, we discuss three different measurements (two traditional and one innovative):

• (Z): the Z-spread methodology (Sect. 4.1),

• (PE): the par-equivalent CDS methodology (Sect. 4.2),

• (HY): the hidden yield approach (Sect. 5).
Important to note is that, according to all these definitions, a negative basis is assigned to a bond, not to an issuer. This means that two different bonds issued by the same company are allowed to have two different negative bases. This viewpoint stands in glaring contrast to some of the more macro-economic considerations carried out in references cited in the next section. CDS protection typically refers to a whole battery of eligible bonds by a reference issuer, and normally the major driver for CDS spreads is considered to be the issuer's default risk. However, some of the deliverable bonds might trade at diverse yields for reasons other than the issuer's default risk—for instance legal issues, liquidity issues, or funding issues, cf. [21] and Sect. 2.

The rest of this article is organized as follows. Section 2 recalls reasons for the existence of negative basis. Section 3 introduces general notations, which are used throughout the remaining sections. Section 4 reviews the traditional methods (Z) and (PE), Sect. 5 discusses the innovative method (HY), and Sect. 6 concludes.

# **2 Why Does Negative Basis Exist?**

There are a couple of intuitive explanations for the existence of negative basis, see, e.g., [1, 2, 4, 6, 10, 19, 24, 26, 30]. For the convenience of the reader, we briefly recall some of them in the sequel.


<sup>2</sup>However, counterparty credit risk can be reduced significantly by a negative basis investor when the CDS is collateralized, which is the usual case.

loses money due to mark-to-market balancing. In theory, one gets this money back eventually, but it might occur that mark-to-market losses exceed one's personal tolerance level during the bond's lifetime. In this case, one has to exit the position and realize the loss. This risk is especially significant if the negative basis position is levered (as happened on a large scale during the financial crisis). Part of the negative basis might be viewed as a risk premium for taking this mark-to-market risk.

Basis "arbitrageurs" are investors that try to earn the negative basis by investing into basis packages. This means that they consider the negative basis an adequate compensation for taking the aforementioned risks. In classical arbitrage theory, their appearance improves trading liquidity. Counterintuitively, however, [9] argue that the advent of CDS was detrimental to bond markets and [20] find some evidence that basis arbitrageurs bring new risks into the corporate bond markets.

# **3 General Notations**

All definitions to follow rely on the pricing of CDS and a plain vanilla coupon bond according to the most simple mathematical setup we can think of. This is in order to make the article as reader-friendly as possible; furthermore, we think the setup is already rich enough in order to convey the main ideas. The only randomness considered in the present article is the default time of the bond issuer, which is formally defined on a probability space (Ω, *F*, Q), with state space Ω, σ-algebra *F*, and probability measure Q. Expected values with respect to the pricing measure Q are denoted by E. The default intensity λ(.) of the issuer's default time τ is assumed to be deterministic, i.e. Q(τ > *t*) = exp(− ∫<sub>0</sub><sup>*t*</sup> λ(*s*) d*s*). Sometimes the function λ(.) is constant, sometimes piecewise constant, depending on our application. For example, the computation of a so-called Z-spread requires λ(.) to be constant,<sup>3</sup> whereas the joint consistent pricing of several CDS quotes with different maturities requires λ(.) to be piecewise constant.
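As a small illustration, the survival probability for a piecewise-constant intensity can be implemented in a few lines. The following Python helper is our own sketch under the assumptions of this section (deterministic, piecewise-constant λ(.)); the function name and interface are illustrative choices, not code from the original text.

```python
import math

def survival(t, knots, lams):
    """Q(tau > t) = exp(-int_0^t lambda(s) ds) for piecewise-constant lambda.

    lams[k] applies on the interval (knots[k], knots[k+1]], with knots[0] == 0.0;
    the last intensity extends beyond the final knot.
    """
    acc, prev = 0.0, 0.0
    for k, lam in enumerate(lams):
        upper = knots[k + 1] if k + 1 < len(knots) else float("inf")
        if t <= upper:
            acc += lam * (t - prev)      # partial bucket up to t
            return math.exp(-acc)
        acc += lam * (upper - prev)      # full bucket
        prev = upper
    return math.exp(-acc)
```

For a constant intensity this collapses to survival(t, [0.0], [lam]) = exp(−lam·t), which is the Z-spread case; a grid with one knot per CDS maturity is what the joint bootstrap of several quotes requires.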

Generally speaking, it is our understanding that a negative basis is a measure for the mispricing between CDS and bonds with respect to default risk alone. This explains why considering the default time as the sole stochastic object corresponds to the most minimal modeling approach possible. Besides the non-randomness of the default intensity, the following further simplifying assumptions are taken for granted throughout:

• We ignore recovery risk: Upon default, the bond holder receives the constant proportion *R* ∈ [0, 1] of her nominal. Default is assumed to instantaneously trigger a credit event of the CDS. The bond is assumed to be a deliverable security in the auction following the CDS trigger event, and the auction process is assumed to yield the same recovery rate *R*. Although this is an unrealistic assumption in

<sup>3</sup>See below in Step 3 of Definition 1.

principle (see, e.g., [17]), a negative basis investor can always eliminate recovery risk by delivering his bonds into the auction (physical settlement), in which case he gets compensated by the (nominal-matched) CDS for the nominal loss of the bond.<sup>4</sup> Consequently, our assumption is not severe for the present purpose.

• We ignore interest rate risk: The discounting curve is deterministic and the discount factors are denoted by *DF*(*t*) := exp(− ∫<sub>0</sub><sup>*t*</sup> *r*(*s*) d*s*) with some given deterministic short rate function *r*(.). All presented negative basis figures are measurements relative to the applied short rate function *r*(.).

Under these assumptions we introduce the following notations:


• The expected discounted value of all premium payments to be made by the CDS protection buyer (the premium leg) is denoted by

$$\begin{aligned}
EDPL(\lambda(.), r(.), s(T), \operatorname{upf}(T), T) &:= \operatorname{upf}(T) + s(T) \sum_{0 < t_i^{(C)} \le T} \left(t_i^{(C)} - t_{i-1}^{(C)}\right) DF\left(t_i^{(C)}\right) \mathbb{Q}\left(\tau > t_i^{(C)}\right) \\
&= \operatorname{upf}(T) + s(T) \sum_{0 < t_i^{(C)} \le T} \left(t_i^{(C)} - t_{i-1}^{(C)}\right) e^{-\int_0^{t_i^{(C)}} \lambda(s) + r(s) \, \mathrm{d}s}.
\end{aligned}$$

• The expected discounted value of the sum of all default compensation payments to be made by the CDS protection seller (the default/protection leg) is denoted by

$$\begin{aligned}
EDDL(\lambda(.), r(.), R, T) &:= (1 - R)\, \mathbb{E}\left[\mathbf{1}_{\{\tau \le T\}}\, DF(\tau)\right] \\
&= (1 - R) \int_0^T DF(y)\, \lambda(y)\, e^{-\int_0^{y} \lambda(s) \, \mathrm{d}s} \, \mathrm{d}y.
\end{aligned}$$

<sup>4</sup>Interestingly, a mismatch between bond and CDS recovery is often favorable for the negative basis investor, since the CDS recovery rate tends to be lower than the bond recovery, see, e.g., [14]. Thus, it might make sense for a negative basis investor to opt for cash settlement of the CDS and sell his bonds in the marketplace, speculating on a favorable recovery mismatch.

<sup>5</sup>See http://www2.isda.org/asset-classes/credit-derivatives/.

<sup>6</sup>For the sake of notational convenience we ignore accrued interest upon default, which can, of course, be incorporated easily.

• The model price of the bond is given by

$$\begin{aligned}
\operatorname{Bond}(\lambda(.), r(.), R, C, T) &:= C \sum_{0 < t_j^{(B)} \le T} \left(t_j^{(B)} - t_{j-1}^{(B)}\right) DF\left(t_j^{(B)}\right) \mathbb{Q}\left(\tau > t_j^{(B)}\right) \\
&\quad + DF(T)\, \mathbb{Q}(\tau > T) + R\, \mathbb{E}\left[\mathbf{1}_{\{\tau \le T\}}\, DF(\tau)\right] \\
&= C \sum_{0 < t_j^{(B)} \le T} \left(t_j^{(B)} - t_{j-1}^{(B)}\right) e^{-\int_0^{t_j^{(B)}} \lambda(s) + r(s) \, \mathrm{d}s} \\
&\quad + e^{-\int_0^{T} \lambda(s) + r(s) \, \mathrm{d}s} + R \int_0^T DF(y)\, \lambda(y)\, e^{-\int_0^{y} \lambda(s) \, \mathrm{d}s} \, \mathrm{d}y.
\end{aligned}$$
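To make the notations concrete, here is a minimal numerical sketch in Python for the special case of flat curves (constant short rate *r* and constant intensity λ), quarterly CDS premium dates, semi-annual bond coupons, and a midpoint rule for the default-leg integral. The function names `edpl`, `eddl`, and `bond` mirror the notation above but are our own illustrative choices, not code from the original text.

```python
import math

def edpl(lam, r, spread, upfront, T, freq=4):
    # premium leg: upfront + spread * sum of accrual fractions, each discounted
    # and survival-weighted, i.e. exp(-(lam + r) * t_i) for flat curves
    dt = 1.0 / freq
    n = int(round(T * freq))
    return upfront + spread * sum(dt * math.exp(-(lam + r) * dt * i) for i in range(1, n + 1))

def eddl(lam, r, R, T, steps=2000):
    # protection leg: (1 - R) * int_0^T exp(-r*y) * lam * exp(-lam*y) dy, midpoint rule
    h = T / steps
    return (1 - R) * sum(lam * math.exp(-(lam + r) * (i + 0.5) * h) * h for i in range(steps))

def bond(lam, r, R, C, T, freq=2):
    # survival-weighted coupons and redemption, plus the recovery payment at default;
    # note R * E[1_{tau<=T} DF(tau)] = (R / (1 - R)) * EDDL
    dt = 1.0 / freq
    n = int(round(T * freq))
    coupons = C * sum(dt * math.exp(-(lam + r) * dt * j) for j in range(1, n + 1))
    recovery = (R / (1.0 - R)) * eddl(lam, r, R, T) if R > 0 else 0.0
    return coupons + math.exp(-(lam + r) * T) + recovery
```

For lam = 0 and R = 0 the bond price collapses to riskless discounting of coupons and redemption, and for flat curves the protection leg has the closed form (1 − R) λ/(λ + r) (1 − e^{−(λ+r)T}), which makes both functions easy to sanity-check.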

# **4 Traditional Measurements**

# *4.1 The Z-Spread Methodology*

The main idea of the *Z-spread methodology* is to define the negative basis as the difference between (expected) annualized bond earnings and annualized protection costs. This method is described, e.g., in [8]. The negative basis *NB*(*Z*) is computed by the following algorithm.

**Definition 1** (*Negative Basis (Z)*)


3. Denoting by *B* the quoted market price of the bond, the Z-spread *z* is defined as the root of the function

$$x \mapsto \operatorname{Bond}(x, r(.), 0, C, T) - B, \tag{1}$$

<sup>7</sup>If CDS prices are quoted in running spreads with zero upfronts, then these quotes typically come naturally equipped with a recovery assumption that is required in order to convert the running spreads into actually tradable standardized coupon and upfront payments. However, after this conversion the recovery rate is a free model parameter.

<sup>8</sup>For a reader-friendly explanation of the Z-spread see [28]. In particular, it is useful to observe that Bond(*x*,*r*(.), *R*,*C*, *T*) = Bond(0,*r*(.) + *x*, *R*,*C*, *T*) for *R* = 0, implying that the Z-spread equals a constant default intensity under a zero recovery assumption.

if existent. In words, the Z-spread is the amount by which the reference short rate *r*(.) needs to be shifted parallelly in order for the discounted bond cash flows to match the market quote. The root, whenever existing at all, is unique.

4. The (zero-upfront) running CDS spread *s*(*T*) for a CDS contract, whose maturity matches the bond's maturity, is defined as

$$s(T) := \frac{EDDL(\lambda(.), r(.), R, T)}{EDPL(\lambda(.), r(.), 1, 0, T)},$$

i.e. the fair running spread when no upfront payment is present.

5. *NB*(*Z*) := *z* − *s*(*T*).

Intuitively, the Z-spread *z* is a measure of the annualized excess return of the bond on top of the "risk-free" rate *r*(.), whereas *s*(*T*) is the annualized CDS protection cost. Hence, *NB*(*Z*) equals the difference between earnings and costs (expected in case of survival). If the function (1) does not have a root in (0,∞), this means that the bond is less risky than the default risk intrinsic in the chosen discounting curve *r*(.). Especially since the liquidity crisis, when the interbank money transfer ran dry, significant spreads between discounting curves obtained from overnight rates and LIBOR-based swap rates are observed. Consequently, one could recognize, e.g., German government bonds with a "negative Z-spread" with respect to the interest rate curve *r*(.), which was obtained from 6-month EURIBOR swap rates. For such reasons it has become market standard to extract the "risk-free" discounting curve from overnight rates rather than from LIBOR-based swap rates. Moreover, [19] point out that the difference between bond yields and CDS spreads can depend on whether treasury rates or swap rates are used for discounting. Since negative basis investors are typically trading in the high yield sector, the function (1) normally does have a root in (0,∞) for several canonical choices of *r*(.), be it extracted from swap rates with overnight tenor, 3-month tenor, or 6-month tenor. But it is important to stress that all presented negative basis measurements are always relative measures depending on the applied interest rate curve *r*(.).
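Under the flat-curve simplifications of Sect. 3 (one maturity-matched CDS, constant intensity, quarterly premiums, semi-annual coupons), Definition 1 can be sketched numerically as follows. The helper functions and the plain bisection root finder are illustrative reconstructions, not the authors' implementation.

```python
import math

def bisect(f, lo, hi, n=200):
    # plain bisection; assumes f changes sign on [lo, hi]
    flo = f(lo)
    for _ in range(n):
        mid = 0.5 * (lo + hi)
        if f(mid) * flo > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def bond_zero_recovery(x, r, C, T, freq=2):
    # Bond(x, r(.), 0, C, T): constant intensity x, zero recovery (cf. footnote 8)
    dt = 1.0 / freq
    n = int(round(T * freq))
    cpn = C * sum(dt * math.exp(-(x + r) * dt * j) for j in range(1, n + 1))
    return cpn + math.exp(-(x + r) * T)

def fair_spread(lam, r, R, T, freq=4, steps=2000):
    # s(T) = EDDL(lam, r, R, T) / EDPL(lam, r, 1, 0, T)
    h = T / steps
    prot = (1 - R) * sum(lam * math.exp(-(lam + r) * (i + 0.5) * h) * h for i in range(steps))
    dt = 1.0 / freq
    annuity = sum(dt * math.exp(-(lam + r) * dt * i) for i in range(1, int(round(T * freq)) + 1))
    return prot / annuity

def negative_basis_z(B, r, C, T, lam_cds, R):
    # Step 3: the Z-spread z is the root of x -> Bond(x, r, 0, C, T) - B
    z = bisect(lambda x: bond_zero_recovery(x, r, C, T) - B, 1e-9, 5.0)
    # Step 5: NB^(Z) = z - s(T)
    return z - fair_spread(lam_cds, r, R, T)
```

If bond and CDS quotes are generated consistently from one and the same constant intensity with R = 0, the resulting NB^(Z) is close to zero, i.e. the sketch reproduces a basis-free market; the small residual stems from the discrete premium grid.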

The Z-spread methodology has some drawbacks:


until maturity. Upon a default event the PnL of the position might be considerably different, depending on the timing of the default, see Fig. 1 in Example 1 below. Hence, the assumed CDS hedge cannot really be considered to be default-risk eliminating (it might either profit from or lose on a default event), and consequently the number *NB*(*Z*) does not deserve to be called a return figure after elimination of default risk, which the negative basis should be in our opinion.

# *4.2 The Par-Equivalent CDS Methodology*

The *par-equivalent CDS methodology* is described in the Appendix of [2]. A similar idea is also outlined in [8, p. 101 ff] and [3]. The negative basis *NB*(*PE*) is computed along the steps of the following algorithm.

### **Definition 2** (*Negative Basis (PE)*)


$$s(T) := \frac{EDDL(\lambda(.), r(.), R, T)}{EDPL(\lambda(.), r(.), 1, 0, T)},$$

i.e. the fair running spread when no upfront payment is present.

4. Denoting by *B* the quoted market price of the bond, a shift *z*˜ is defined as the root of the function

$$\mathbf{x} \mapsto \mathbf{B} \text{and} (\lambda(.) + \mathbf{x}, r(.), R, C, T) - B,$$

if existent. In words, the bond is priced with the default intensities λ(.) that are consistent with CDS quotes, which are then shifted parallelly until the bond's market quote is matched.

5. A second (zero-upfront) running CDS spread *s*˜(*T*) for a CDS contract, whose maturity matches the bond's maturity, is defined as

$$
\tilde{s}(T) := \frac{EDDL(\lambda(.) + \tilde{z}, r(.), R, T)}{EDPL(\lambda(.) + \tilde{z}, r(.), 1, 0, T)},
$$

i.e. the fair spread when no upfront payment is present, but now computed with the shifted intensity rates λ(.) + *z̃*, which are required in order to price the bond correctly.

6. *NB*(*PE*) := *s̃*(*T*) − *s*(*T*).

The main idea of (PE) is to question the default probabilities bootstrapped from the given CDS quotes, and to adjust them in order to match the bond quote. On a high level, this negative basis measurement is based on the difference between default probabilities that are required in order to match the bond price and default probabilities that are required in order to fit the CDS quotes.
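For a single maturity-matched, zero-upfront CDS quote and flat curves, Definition 2 reduces to two one-dimensional root searches. The following Python sketch is an illustrative reconstruction under these simplifying assumptions (with one quote, the bootstrapped intensity is a single constant).

```python
import math

def bisect(f, lo, hi, n=200):
    # plain bisection; assumes f changes sign on [lo, hi]
    flo = f(lo)
    for _ in range(n):
        mid = 0.5 * (lo + hi)
        if f(mid) * flo > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def annuity(lam, r, T, freq=4):
    # EDPL(lam, r, 1, 0, T): unit-spread premium leg
    dt = 1.0 / freq
    return sum(dt * math.exp(-(lam + r) * dt * i) for i in range(1, int(round(T * freq)) + 1))

def protection(lam, r, R, T, steps=2000):
    # EDDL(lam, r, R, T), midpoint rule
    h = T / steps
    return (1 - R) * sum(lam * math.exp(-(lam + r) * (i + 0.5) * h) * h for i in range(steps))

def bond(lam, r, R, C, T, freq=2):
    dt = 1.0 / freq
    cpn = C * sum(dt * math.exp(-(lam + r) * dt * j) for j in range(1, int(round(T * freq)) + 1))
    rec = (R / (1.0 - R)) * protection(lam, r, R, T) if R > 0 else 0.0
    return cpn + math.exp(-(lam + r) * T) + rec

def negative_basis_pe(B, s_quote, r, R, C, T):
    # steps 1-3: bootstrap the constant intensity from the CDS quote
    lam = bisect(lambda l: protection(l, r, R, T) - s_quote * annuity(l, r, T), 1e-9, 5.0)
    # step 4: shift z~ such that the shifted-intensity bond price matches B
    z = bisect(lambda x: bond(lam + x, r, R, C, T) - B, -lam + 1e-9, 5.0)
    # steps 5-6: NB^(PE) = s~(T) - s(T)
    return protection(lam + z, r, R, T) / annuity(lam + z, r, T) - s_quote
```

In a consistent market, where the bond quote is produced by the same intensity that prices the CDS, the shift z̃ and hence NB^(PE) vanish, which is a convenient unit test for any implementation.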

The methodology (PE) has some drawbacks:


# **5 An Innovative Methodology**

In our opinion, the negative basis should be a spread on top of a reference discounting curve which can be earned without exposure to default risk. This means we question the usual assumption that the applied discounting curve *r*(.) is the appropriate risk-free rate to be used, because there is actually a higher rate that can be earned "risk-free" (recalling that default risk is the only risk within our tiny model). This motivates what we call the *hidden yield approach*. The negative basis *NB*(*HY*) is computed along the steps of the following algorithm.

#### **Definition 3** (*Negative Basis (HY)*)


3. *NB*(*HY*) is defined as the root<sup>9</sup> of the function

$$x \mapsto \operatorname{Bond}(\lambda_x(.), r(.) + x, R, C, T) - B.$$

In words, *NB*(*HY*) is precisely the parallel shift of the reference short rate *r*(.) which allows for a calibration such that the model prices of bond and CDS match the observed market quotes for bond and CDS.

<sup>9</sup>Lemma A.1 in the Appendix guarantees that this root typically exists and is unique.

The idea of method (HY) can also be summarized as follows: If the risk-free interest rate curve is assumed to be *r*(.) + *NB*(*HY*) , then the market quotes for bond and CDS are arbitrage-free (as we have found a corresponding pricing measure). It allows for the intuitive interpretation of the negative basis as a spread earned on top of a reference discounting rate after elimination of default risk. Abstractly speaking, assuming no transaction costs and availability of CDS protection at all maturities *T* > 0 (= perfect market conditions), arbitrage pricing theory suggests the existence of a trading strategy which buys the bond and hedges it via CDS, and which earns<sup>10</sup> precisely the rate *r*(.) + *NB*(*HY*) until the minimum of default time τ and bond maturity *T*. Since this way of thinking about *NB*(*HY*) is its distinctive property and highlights its intrinsic coherence with arbitrage pricing theory, the following lemma demonstrates by a heuristic argument how the rate *r*(.) + *NB*(*HY*) can be earned in a risk-free way.
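The defining fixed point of method (HY) can likewise be sketched numerically: for each trial shift x, re-bootstrap the intensity from the CDS quote under the shifted curve r + x, price the bond with this pair, and search for the x at which the model price matches the bond quote. The Python below is an illustrative reconstruction for one maturity-matched CDS quote and flat curves; all names and bisection brackets are our own choices.

```python
import math

def bisect(f, lo, hi, n=60):
    # plain bisection; assumes f changes sign on [lo, hi]
    flo = f(lo)
    for _ in range(n):
        mid = 0.5 * (lo + hi)
        if f(mid) * flo > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def annuity(lam, r, T, freq=4):
    dt = 1.0 / freq
    return sum(dt * math.exp(-(lam + r) * dt * i) for i in range(1, int(round(T * freq)) + 1))

def protection(lam, r, R, T, steps=400):
    h = T / steps
    return (1 - R) * sum(lam * math.exp(-(lam + r) * (i + 0.5) * h) * h for i in range(steps))

def bond(lam, r, R, C, T, freq=2):
    dt = 1.0 / freq
    cpn = C * sum(dt * math.exp(-(lam + r) * dt * j) for j in range(1, int(round(T * freq)) + 1))
    rec = (R / (1.0 - R)) * protection(lam, r, R, T) if R > 0 else 0.0
    return cpn + math.exp(-(lam + r) * T) + rec

def negative_basis_hy(B, s_quote, upf_quote, r, R, C, T):
    def bond_given_shift(x):
        # re-bootstrap lam_x from EDPL = EDDL under the shifted curve r + x
        g = lambda l: protection(l, r + x, R, T) - (upf_quote + s_quote * annuity(l, r + x, T))
        lam_x = bisect(g, 1e-9, 5.0)
        return bond(lam_x, r + x, R, C, T)
    # NB^(HY): root of x -> Bond(lam_x(.), r + x, R, C, T) - B
    return bisect(lambda x: bond_given_shift(x) - B, -0.25, 0.25)
```

A consistent market (bond and CDS both priced from the same intensity under r) yields NB^(HY) ≈ 0, while quoting the bond cheaper than its CDS-implied model price produces a positive hidden yield, in line with the monotonicity established in Lemma A.1.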

**Lemma 1** (The rate *r*(.) + *NB*(*HY*) can be earned without default risk) *Assuming perfect market conditions, there exists a (static) portfolio, which is long the bond and invested in several CDS, which earns the rate r*(.) + *NB*(*HY*) *until* min{τ, *T*}*.*

*Proof (heuristic)* We denote by Q the probability measure under which τ has the piecewise-constant default intensity λ*NB*(*HY*)(.). We discretize the time interval [0, *T*] into *m* buckets 0 =: *t*0 < *t*1 < ... < *tm* := *T*, where *m* may be chosen arbitrarily large, so that the mesh of the discrete-time grid tends to zero as *m* tends to infinity. We introduce the following *m* + 1 probabilities:

$$w_j^{(m)} := \mathbb{Q}\left(\tau \in (t_{j-1}, t_j]\right), \quad j = 1, \ldots, m, \qquad w_{m+1}^{(m)} := \mathbb{Q}(\tau > t_m).$$

Now let τ (*m*) denote a random variable with distribution

$$\begin{aligned}
\mathbb{Q}\left(\tau^{(m)} = \bar{t}_j\right) &= w_j^{(m)}, \quad \bar{t}_j := \frac{t_{j-1} + t_j}{2}, \quad j = 1, \ldots, m, \\
\mathbb{Q}\left(\tau^{(m)} > t\right) &= \mathbb{Q}(\tau > t), \quad t \ge t_m \quad \left(\text{in particular, } \mathbb{Q}\left(\tau^{(m)} > t_m\right) = w_{m+1}^{(m)}\right).
\end{aligned}$$

Notice that τ(*m*) ≈ τ in distribution, with the approximation improving with increasing *m*. In the sequel, we work with τ(*m*), assuming that default during [0, *T*] can only take place at the possible realizations *t̄*1,...,*t̄m* of τ(*m*) in [0, *T*]. We now consider a portfolio of *m* + 1 instruments, namely the bond and one CDS for each maturity *t*1,..., *tm*. We assume that the bond nominal is given by *N*0. Furthermore, *Ni* ∈ ℝ denotes the nominal of the CDS with maturity *ti*. A negative nominal means that we sell the bond or sell CDS protection. Let's have a look at the following random variables, which are functions of τ(*m*):

<sup>10</sup>By "earning" *<sup>r</sup>*(.) <sup>+</sup> *NB*(*HY*) we mean that the internal rate of return of the position is the reference rate *r*(.) plus a spread *NB*(*HY*) .


$$V^{(0)}\left(\tau^{(m)}\right) := \left(r(.) + NB^{(HY)}\right)\text{-discounted value of all cash flows from the bond, when default takes place at } \tau^{(m)},$$

$$V^{(i)}\left(\tau^{(m)}\right) := \left(r(.) + NB^{(HY)}\right)\text{-discounted value of all cash flows from the CDS with maturity } t_i\text{, when default takes place at } \tau^{(m)}, \quad i = 1, \ldots, m.$$

All random variables *V*(*i*)(τ(*m*)) take on only *m* + 1 possible values, since their value on the event {τ(*m*) > *tm*} does not depend on τ(*m*) (as there are no cash flows after *tm*). So without loss of generality we may write *V*(*i*)(τ(*m*)) = *V*(*i*)(*t̄m*+1) for some arbitrary *t̄m*+1 > *tm* on the event {τ(*m*) > *tm*}. Our goal is to show that it is possible to find a non-zero vector (*N*0,...,*Nm*) ∈ ℝ*m*+1 such that

$$\underbrace{N_0\, V^{(0)}\left(\tau^{(m)}\right) + \sum_{i=1}^{m} N_i\, V^{(i)}\left(\tau^{(m)}\right)}_{\left(r(.) + NB^{(HY)}\right)\text{-discounted value of all portfolio cash flows}} = \underbrace{N_0\, B + \sum_{i=1}^{m} N_i\, \operatorname{upf}(t_i)}_{\text{initial investment amount}}, \tag{2}$$

where *B* denotes the market bond price and upf(*ti*) the market upfront of the CDS with maturity *ti*. This mathematical statement intuitively means that the considered portfolio of bond and CDS earns the rate *r*(.) + *NB*(*HY*) until min{τ(*m*), *T*} in a risk-free manner, regardless of the actual timing of the default. Now why is this possible? Considering the randomness on the left-hand side of Eq. (2), we actually have *m* + 1 equations for the *m* + 1 unknowns *N*0, *N*1,...,*Nm*. Rewriting Eq. (2) in terms of linear algebra, we obtain

$$
\begin{pmatrix}
V^{(0)}(\bar{t}_1) - B & V^{(1)}(\bar{t}_1) - \operatorname{upf}(t_1) & \cdots & V^{(m)}(\bar{t}_1) - \operatorname{upf}(t_m) \\
\vdots & \vdots & \ddots & \vdots \\
V^{(0)}(\bar{t}_{m+1}) - B & V^{(1)}(\bar{t}_{m+1}) - \operatorname{upf}(t_1) & \cdots & V^{(m)}(\bar{t}_{m+1}) - \operatorname{upf}(t_m)
\end{pmatrix}
\begin{pmatrix}
N_0 \\ N_1 \\ \vdots \\ N_m
\end{pmatrix}
=
\begin{pmatrix}
0 \\ 0 \\ \vdots \\ 0
\end{pmatrix}.
\tag{3}
$$

In order to prove the existence of a non-trivial solution (*N*0,...,*Nm*) to Eq. (3), it suffices to verify that the associated (*m* + 1) × (*m* + 1)-matrix does not have full rank. Now here enters the essential heuristic argument: it follows from the definition of *NB*(*HY*) that

$$\begin{aligned} \sum\_{j=1}^{m+1} \boldsymbol{w}\_j^{(m)} \, \boldsymbol{V}^{(0)}(\overline{t}\_j) &\approx B = \sum\_{j=1}^{m+1} \boldsymbol{w}\_j^{(m)} \, \boldsymbol{B},\\ \sum\_{j=1}^{m+1} \boldsymbol{w}\_j^{(m)} \, \boldsymbol{V}^{(i)}(\overline{t}\_j) &\approx \text{upf}(t\_i) = \sum\_{j=1}^{m+1} \boldsymbol{w}\_j^{(m)} \, \text{upf}(t\_i), \quad i = 1, \dots, m, \end{aligned}$$

with the approximations becoming equalities as *m* → ∞. In other words, this means that the rows of the equation system (3) are linearly dependent. Consequently, the associated matrix cannot have full rank and the columns must also be linearly dependent, i.e. there exists a non-zero solution (*N*0,...,*Nm*) of Eq. (3), and hence of (2), as desired. Finally, taking a close look at the structure of the involved cash flows, it is obvious that a solution must satisfy *N*0 ≠ 0. Without loss of generality we may hence set *N*0 = 1 (because if (*N*0,...,*Nm*) is a solution, so is α (*N*0,...,*Nm*) for arbitrary α ∈ ℝ). Concluding, the portfolio we have found is long the bond. □

We present an example that demonstrates how different the three presented measurements of negative basis can be in practice. The specifications are inspired by a real-world case.

*Example 1* We consider a bond with maturity *T* = 3.5 years paying a semi-annual coupon rate of *C* = 8.25 %. It trades far below par value, namely at *B* = 46.5 %. An almost maturity-matched CDS contract is available at an upfront value of upf(*T*) = 53 % with a running coupon of *s*(*T*) = 5 %, paid quarterly. This means a nominal-matched negative basis investment comes at a package price of 46.5 + 53 = 99.5 %, and pays a coupon rate of 8.25 − 5 = 3.25 % until default (however, the bond and CDS coupon payments have different frequencies and payment dates). In the sequel we assume a recovery rate of *R* = 20 %, and the reference rate *r*(.) is bootstrapped from 3-month tenor-based interest rate swaps according to the raw interpolation method described in [15, 16]. Because the bond trades far below par, the measurement (Z) is highly questionable and returns *NB*(*Z*) = −0.42 %, which is clearly not an appropriate measurement. As indicated earlier, improved versions of earnings- and costs-measurements must be used in order to deal with such extreme situations of highly distressed bonds, but this lies outside the scope of the present article. The par-equivalent CDS methodology returns the measurement *NB*(*PE*) = 2.29 %, whereas the hidden yield methodology returns the significantly lower number *NB*(*HY*) = 1.18 %. While the authors are not aware of a strategy to monetize the (PE)-measurement 2.29 %, Lemma 1 provides a clear interpretation for the (HY)-measurement 1.18 % in terms of an internal rate of return that can be earned on top of the risk-free rate, when the negative basis investment is structured as indicated in the proof of Lemma 1.

Now if the described nominal-matched investment seems to earn a rate of 3.25 %, which equals a spread of around 1.75 % above the chosen reference rate *r*(.) in the present example, why is the measurement *NB*(*HY*) so low? Fig. 1 visualizes the discounted value of the sum over all cash flows from the nominal-matched investment as a function of the default time. For instance, in case of survival until maturity, this value equals approximately 104 %, yielding a return (after discounting) of 5.61 % on the initial investment of 98.39 % (which equals the package price minus the accrued CDS coupon; the bond accrued interest equals zero). Distributed over the 3.5-year investment horizon, this corresponds to a rate of approximately 1.6 % per annum. However, in case of a default just before the first or second bond coupon payment date the

described negative basis investment faces a loss. The additional short-dated CDS protection required in order to hedge these potential losses decreases the earnings potential of the investment, which is accounted for in the (HY)-methodology, as explained in the proof of Lemma 1.

# **6 Conclusion**

We proposed an innovative measurement for the negative basis, denoted *NB*(*HY*) . Compared to traditional approaches, it is based on an arbitrage-free pricing model for the simultaneous pricing of the bond and the CDS, which provides a sound economic interpretation. Within a simple model with only default risk being present, the negative basis is perfectly explained as the spread on top of a reference interest rate curve *r*(.). It was pointed out how the rate *r*(.) + *NB*(*HY*) can be earned without exposure to default risk.

**Acknowledgements** The KPMG Center of Excellence in Risk Management is acknowledged for organizing the conference "Challenges in Derivatives Markets - Fixed Income Modeling, Valuation Adjustments, Risk Management, and Regulation".

# **Appendix: The algorithm in Definition 3 is well-defined**

The following technical lemma guarantees that Step 3 in Definition 3 admits a unique solution that can be found efficiently by means of a bisection routine.

**Lemma A.1** (Method (HY) is well-defined)

*(a) The function x* → *Bond*(λ*x*(.),*r*(.) + *x*, *R*,*C*, *T*) *is continuous.*


*Proof* We prove parts (a), (b), and (c) separately.


$$\begin{aligned}
\operatorname{Bond}(\lambda_x(.), r(.) + x, R, C, T) &= e^{-\int_0^T \lambda_x(s) + r(s) + x \, \mathrm{d}s} \\
&\quad + C \sum_{0 < t_j^{(B)} \le T} \left(t_j^{(B)} - t_{j-1}^{(B)}\right) e^{-\int_0^{t_j^{(B)}} \lambda_x(s) + r(s) + x \, \mathrm{d}s} \\
&\quad + \frac{R}{1 - R}\, EDPL(\lambda_x(.), r(.) + x, s(T), \operatorname{upf}(T), T),
\end{aligned}$$

where we have used *EDPL* = *EDDL* from the CDS bootstrap. This shows that it suffices to check that the function

$$x \mapsto \lambda\_x(t) + x$$

is increasing for each fixed *t*, because all summands in the above bond formula are then obviously decreasing.

We proceed with an auxiliary observation. If τ<sup>1</sup> and τ<sup>2</sup> are two positive random variables with distribution functions *F*<sup>1</sup> and *F*2, satisfying *F*<sup>1</sup> ≥ *F*<sup>2</sup> pointwise on an interval (*T*,∞) and *F*<sup>1</sup> ≡ *F*<sup>2</sup> on [0, *T*], then E[*g*(τ1)] ≥ E[*g*(τ2)] for any bounded function *g* : (0,∞) → [0,*K*], which is non-increasing on (*T*,∞). To verify this,<sup>11</sup> define the non-decreasing function *h* := −*g* and use integration by parts:

<sup>11</sup>One says that τ<sup>1</sup> is less than τ<sup>2</sup> in the usual stochastic order, and the following computation is standard in the respective theory.


$$\begin{aligned}
\mathbb{E}[g(\tau_1)] &= -\int h \, \mathrm{d}F_1 = -\int_{(0,T]} h \, \mathrm{d}F_1 - \int_{(T,\infty)} h \, \mathrm{d}F_1 \\
&= -\int_{(0,T]} h \, \mathrm{d}F_2 - \Bigl( h(\infty) \underbrace{F_1(\infty)}_{=1=F_2(\infty)} - h(T) \underbrace{F_1(T)}_{=F_2(T)} - \int_{(T,\infty)} F_1 \, \mathrm{d}h \Bigr) \\
&\ge -\int_{(0,T]} h \, \mathrm{d}F_2 - \Bigl( h(\infty)\, F_2(\infty) - h(T)\, F_2(T) - \int_{(T,\infty)} F_2 \, \mathrm{d}h \Bigr) \\
&= -\int_{(0,T]} h \, \mathrm{d}F_2 - \int_{(T,\infty)} h \, \mathrm{d}F_2 = -\int h \, \mathrm{d}F_2 = \mathbb{E}[g(\tau_2)].
\end{aligned}$$

Now we proceed inductively over *k* = 1,..., *m* by showing that *x* → λ*x*(*t*) + *x* is non-decreasing for all fixed *t* ∈ (*Tk*−1, *Tk*], i.e. that *x* → *yk*(*x*) + *x* is non-decreasing. We start the induction with *k* = 1. To this end, recall that *y*1(*x*) is the unique root of the equation

$$EDPL(\mathbf{y}\_1(\mathbf{x}), r(.) + \mathbf{x}, \mathbf{s}(T\_1), \mathbf{upf}(T\_1), T\_1) = EDDL(\mathbf{y}\_1(\mathbf{x}), r(.) + \mathbf{x}, R, T\_1).$$

For the sake of a more compact notation we denote the left-hand side of the last equation by *LHS*(*x*, *y*1(*x*)) and the right-hand side by *RHS*(*x*, *y*1(*x*)). Furthermore, we denote the value of both sides by *V*(*x*) := *LHS*(*x*, *y*1(*x*)) = *RHS*(*x*, *y*1(*x*)). Since all the summands of *LHS* depend on the function *x* → *x* + *y*1(*x*) in a monotonic way, it is obvious that *V*(*x*) is non-increasing in *x* if and only if the function *x* → *x* + *y*1(*x*) is non-decreasing. Hence, it suffices to prove that *V*(*x*) is non-increasing in *x*. To this end, we (obviously) observe with ε > 0 that

$$LHS(\mathbf{x} + \varepsilon, \mathbf{y}\_1(\mathbf{x})) \le LHS(\mathbf{x}, \mathbf{y}\_1(\mathbf{x})) = V(\mathbf{x}),\tag{4}$$

$$RHS(\mathbf{x} + \varepsilon, \mathbf{y}\_1(\mathbf{x})) \le RHS(\mathbf{x}, \mathbf{y}\_1(\mathbf{x})) = V(\mathbf{x}).\tag{5}$$

Furthermore, the function *y* → *LHS*(*x* + ε, *y*) is obviously strictly decreasing. Concerning the right-hand side, we denote by E*y*[*f*(τ )] the expectation over *f*(τ ) when the default time τ has an exponential distribution with parameter *y*. The function

$$y \mapsto RHS(x + \varepsilon, y) = (1 - R)\, \mathbb{E}_y\left[ e^{-\int_0^{\tau} r(s) + x + \varepsilon \, \mathrm{d}s}\, \mathbf{1}_{\{\tau \le T_1\}} \right]$$

is non-decreasing on the claimed interval by the auxiliary observation we have derived above (increasing *y* corresponds to increasing the distribution function of the default time τ pointwise<sup>12</sup>). We now distinguish two cases:

<sup>12</sup>Here, we have used that the function τ → exp(− ∫<sub>0</sub><sup>τ</sup> *r*(*s*) + *x* + ε d*s*) 1{τ≤*T*1} is non-increasing if *x* ≥ − inf{*r*(*t*) : *t* ≥ 0}.

(i) *LHS*(*x* + ε, *y*1(*x*)) ≤ *RHS*(*x* + ε, *y*1(*x*)): In this case *y*1(*x* + ε) ≤ *y*1(*x*), because otherwise we would observe the following contradiction:

$$\begin{split} LHS(\mathbf{x} + \boldsymbol{\varepsilon}, \mathbf{y}\_1(\mathbf{x} + \boldsymbol{\varepsilon})) &< LHS(\mathbf{x} + \boldsymbol{\varepsilon}, \mathbf{y}\_1(\mathbf{x})) \leq RHS(\mathbf{x} + \boldsymbol{\varepsilon}, \mathbf{y}\_1(\mathbf{x})) \\ &\leq RHS(\mathbf{x} + \boldsymbol{\varepsilon}, \mathbf{y}\_1(\mathbf{x} + \boldsymbol{\varepsilon})). \end{split}$$

This implies that

$$V(\mathbf{x} + \boldsymbol{\varepsilon}) = RHS(\mathbf{x} + \boldsymbol{\varepsilon}, \mathbf{y}\_1(\mathbf{x} + \boldsymbol{\varepsilon})) \le RHS(\mathbf{x} + \boldsymbol{\varepsilon}, \mathbf{y}\_1(\mathbf{x})) \stackrel{(5)}{\le} V(\mathbf{x}).$$

(ii) *LHS*(*x* + ε, *y*1(*x*)) > *RHS*(*x* + ε, *y*1(*x*)): In this case *y*1(*x* + ε) ≥ *y*1(*x*), because otherwise we would observe the following contradiction:

$$\begin{split} RHS(\mathbf{x} + \boldsymbol{\varepsilon}, \mathbf{y}\_1(\mathbf{x} + \boldsymbol{\varepsilon})) &\leq RHS(\mathbf{x} + \boldsymbol{\varepsilon}, \mathbf{y}\_1(\mathbf{x})) < LHS(\mathbf{x} + \boldsymbol{\varepsilon}, \mathbf{y}\_1(\mathbf{x})) \\ &\leq LHS(\mathbf{x} + \boldsymbol{\varepsilon}, \mathbf{y}\_1(\mathbf{x} + \boldsymbol{\varepsilon})). \end{split}$$

This implies that

$$V(\mathbf{x} + \boldsymbol{\varepsilon}) = LHS(\mathbf{x} + \boldsymbol{\varepsilon}, \mathbf{y}\_1(\mathbf{x} + \boldsymbol{\varepsilon})) \le LHS(\mathbf{x} + \boldsymbol{\varepsilon}, \mathbf{y}\_1(\mathbf{x})) \stackrel{(4)}{\leq} V(\mathbf{x}).$$

Concluding, *V*(*x*) is non-increasing in *x* and the induction start is finished. We proceed with the induction step, assuming that we already know that *x* + λ*x*(*t*) is non-decreasing in *x* for each fixed *t* ≤ *Tk*−1. To this end, recall that *yk* (*x*) is the unique root of the equation

$$EDPL(\lambda\_x(.), r(.) + x, s(T\_k), \text{upf}(T\_k), T\_k) = EDDL(\lambda\_x(.), r(.) + x, R, T\_k),$$

where *y<sub>k</sub>*(*x*) enters the equation as the function value of λ<sub>*x*</sub>(.) on the interval (*T*<sub>*k*−1</sub>, *T<sub>k</sub>*]. The left-hand side of the last equation can be rewritten as follows, using the standard market convention of standardized CDS strike rates *s*(*T*<sub>*k*−1</sub>) = *s*(*T<sub>k</sub>*) =: *s*:

$$\begin{split} &EDPL(\lambda\_{x}(.), r(.) + x, s, \text{upf}(T\_{k}), T\_{k}) \\ &= EDPL(\lambda\_{x}(.), r(.) + x, s, \text{upf}(T\_{k-1}), T\_{k-1}) + \text{upf}(T\_{k}) - \text{upf}(T\_{k-1}) \\ &\quad + s \sum\_{T\_{k-1} < t\_{i}^{(C)} \le T\_{k}} \left(t\_{i}^{(C)} - t\_{i-1}^{(C)}\right) e^{-\int\_{0}^{t\_{i}^{(C)}} \lambda\_{x}(s) + r(s) + x \, \mathrm{d}s}. \end{split}$$

Similarly, the right-hand side can be rewritten as follows:

$$\begin{aligned} EDDL(\lambda\_x(.), r(.) + x, R, T\_k) &= EDDL(\lambda\_x(.), r(.) + x, R, T\_{k-1}) \\ &\quad + (1 - R)\, y\_k(x) \int\_{T\_{k-1}}^{T\_k} e^{-\int\_0^t r(s) + x + \lambda\_x(s) \, \mathrm{d}s} \, \mathrm{d}t. \end{aligned}$$

Since the values (*y*<sub>1</sub>(*x*), . . . , *y*<sub>*k*−1</sub>(*x*)) have been determined before, we may subtract the *EDDL* and *EDPL* with maturity *T*<sub>*k*−1</sub> on both sides of the defining equation for *y<sub>k</sub>*(*x*), simplifying the latter to

$$\begin{aligned} & \text{upf}(T\_k) - \text{upf}(T\_{k-1}) + s \sum\_{T\_{k-1} < t\_i^{(C)} \le T\_k} \left( t\_i^{(C)} - t\_{i-1}^{(C)} \right) e^{-\int\_0^{t\_i^{(C)}} \lambda\_x(s) + r(s) + x \, \mathrm{d}s} \\ & = (1 - R)\, y\_k(x) \int\_{T\_{k-1}}^{T\_k} e^{-\int\_0^t r(s) + x + \lambda\_x(s) \, \mathrm{d}s} \, \mathrm{d}t. \end{aligned}$$

Again, we denote the left-hand side of the last equation by *LHS*(*x*, *y*1(*x*), . . . , *yk* (*x*)), and the right-hand side is denoted *RHS*(*x*, *y*1(*x*), . . . , *yk* (*x*)). Furthermore, we denote the value of both sides by

$$V(\mathbf{x}) := LHS(\mathbf{x}, \mathbf{y}\_1(\mathbf{x}), \dots, \mathbf{y}\_k(\mathbf{x})) = RHS(\mathbf{x}, \mathbf{y}\_1(\mathbf{x}), \dots, \mathbf{y}\_k(\mathbf{x})).$$

By induction hypothesis, the function *x* → *x* + λ*x*(*t*) is non-decreasing for each *t* ≤ *Tk*−1. With ε > 0 this obviously implies that

$$\begin{split} LHS(\mathbf{x}+\varepsilon, \mathbf{y}\_1(\mathbf{x}+\varepsilon), \dots, \mathbf{y}\_{k-1}(\mathbf{x}+\varepsilon), \mathbf{y}\_k(\mathbf{x})) \\ \leq LHS(\mathbf{x}, \mathbf{y}\_1(\mathbf{x}), \dots, \mathbf{y}\_k(\mathbf{x})) = V(\mathbf{x}), \\ RHS(\mathbf{x}+\varepsilon, \mathbf{y}\_1(\mathbf{x}+\varepsilon), \dots, \mathbf{y}\_{k-1}(\mathbf{x}+\varepsilon), \mathbf{y}\_k(\mathbf{x})) \\ \leq RHS(\mathbf{x}, \mathbf{y}\_1(\mathbf{x}), \dots, \mathbf{y}\_k(\mathbf{x})) = V(\mathbf{x}). \end{split} \tag{6}$$

Also, the function *y* → *LHS*(*x* + ε, *y*1(*x* + ε), . . . , *yk*−<sup>1</sup>(*x* + ε), *y*) is obviously non-increasing, whereas the function *y* → *RHS*(*x* + ε, *y*1(*x* + ε), . . . , *yk*−<sup>1</sup>(*x* + ε), *y*) is non-decreasing by a similar argument as in the induction start, namely: the right-hand side has the form<sup>13</sup>

$$RHS(x + \varepsilon, y\_1(x + \varepsilon), \dots, y\_{k-1}(x + \varepsilon), y) = (1 - R)\, \mathbb{E}\_{y} \left[ e^{-\int\_0^{\tau} r(s) + x + \varepsilon \, \mathrm{d}s}\, \mathbf{1}\_{\{\tau \in (T\_{k-1}, T\_k]\}} \right],$$

which is non-decreasing in *y*. Why? Because an increase of *y* increases the distribution function of τ pointwise on [*T*<sub>*k*−1</sub>, ∞) but leaves it unchanged on [0, *T*<sub>*k*−1</sub>], and the function $\tau \mapsto \exp(-\int_0^{\tau} r(s) + x + \varepsilon \, \mathrm{d}s)\, \mathbf{1}_{\{\tau \in (T_{k-1}, T_k]\}}$ is clearly non-increasing on (*T*<sub>*k*−1</sub>, ∞) (so that our auxiliary observation above applies). As in the induction start, showing that *x* → *x* + *y<sub>k</sub>*(*x*) is non-decreasing in *x* is equivalent to showing that *V*(*x*) is non-increasing in *x*. The remaining proof is completely analogous to the induction start and is left to the reader.

<sup>13</sup>As in the induction start, we denote by E<sub>*y*</sub>[*f*(τ)] the expectation of *f*(τ) when the default time has piecewise constant intensity with level *y* on the piece (*T*<sub>*k*−1</sub>, *T<sub>k</sub>*].
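To make the bootstrap of part (b) concrete, the following sketch solves the simplified calibration equation numerically, maturity by maturity. It is a toy illustration only: a flat shifted rate *r* + *x*, quarterly premium dates and a midpoint rule for the default-leg integral are our own assumptions, not part of the proof.

```python
import math

def bootstrap_intensity(knots, upf, s, R, rx, dt_prem=0.25):
    """Bootstrap piecewise-constant default intensities y_1, ..., y_n from
    upfronts upf(T_k), one maturity at a time, by solving the (simplified)
    equation premium-leg increment = default-leg increment for y_k.
    knots: maturities T_1 < ... < T_n; rx: flat shifted short rate r + x."""
    levels = []

    def Lam(t):  # cumulative hazard  int_0^t lambda_x(s) ds
        acc, prev = 0.0, 0.0
        for T, lam in zip(knots, levels):
            acc += lam * max(0.0, min(t, T) - prev)
            prev = T
        return acc

    for k, Tk in enumerate(knots):
        Tprev = knots[k - 1] if k else 0.0
        upf_prev = upf[k - 1] if k else 0.0

        def gap(y):  # LHS(y) - RHS(y); decreasing in y, as shown in the proof
            levels.append(y)
            lhs = upf[k] - upf_prev          # upfront increment
            t = Tprev + dt_prem              # coupons accruing on (T_{k-1}, T_k]
            while t <= Tk + 1e-12:
                lhs += s * dt_prem * math.exp(-rx * t - Lam(t))
                t += dt_prem
            n = 200                          # default-leg increment, midpoint rule
            h = (Tk - Tprev) / n
            rhs = sum(math.exp(-rx * (Tprev + (j + 0.5) * h)
                               - Lam(Tprev + (j + 0.5) * h)) for j in range(n))
            rhs *= (1.0 - R) * y * h
            levels.pop()
            return lhs - rhs

        lo, hi = 1e-9, 5.0                   # bisection on the root y_k
        while hi - lo > 1e-10:
            mid = 0.5 * (lo + hi)
            lo, hi = (mid, hi) if gap(mid) > 0 else (lo, mid)
        levels.append(0.5 * (lo + hi))
    return levels
```

Bisection is justified precisely by the monotonicity established above: the premium-leg increment decreases and the default-leg increment increases in *y<sub>k</sub>*, so their difference has a unique root.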

(c) Denoting by Q<sub>*x*</sub> the probability measure induced by the default intensities λ<sub>*x*</sub>(.), we have

$$\begin{aligned} \text{Bond}(\lambda\_x(.), r(.) + x, R, C, T) &:= C \sum\_{0 < t\_j^{(B)} \le T} DF\left(t\_j^{(B)}\right) \left(t\_j^{(B)} - t\_{j-1}^{(B)}\right) \mathbb{Q}\_x\left(\tau > t\_j^{(B)}\right) \\ &\quad + DF(T)\, \mathbb{Q}\_x\left(\tau > T\right) + R\, \mathbb{E}\_x[\mathbf{1}\_{\{\tau \le T\}} DF(\tau)]. \end{aligned}$$

We know from the consistent CDS pricing that the appearing expectation can be replaced by the premium leg of the CDS, which can be bounded from below by the upfront, i.e.

$$\begin{aligned} R\, \mathbb{E}\_{x}[\mathbf{1}\_{\{\tau \le T\}} DF(\tau)] &= \frac{R}{1 - R}\, EDPL(\lambda\_{x}(.), r(.) + x, s(T), \text{upf}(T), T) \\ &\ge \frac{R}{1 - R}\, \text{upf}(T), \end{aligned}$$

which in turn implies the claim.

**Open Access** This chapter is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, a link is provided to the Creative Commons license and any changes made are indicated.

The images or other third party material in this chapter are included in the work's Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work's Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.

# **References**


# **The Impact of a New CoCo Issuance on the Price Performance of Outstanding CoCos**

**Jan De Spiegeleer, Stephan Höcht, Ine Marquet and Wim Schoutens**

**Abstract** Contingent convertible bonds (CoCos) are new hybrid capital instruments with a loss-absorbing capacity which is enforced either automatically via the breaching of a particular CET1 level or via a regulatory trigger. The price performance of outstanding CoCos after a new CoCo issue is announced by the same issuer is investigated in this paper via two methods. The first method compares the returns of the outstanding CoCos after the announcement of a new issue with some overall CoCo indices. This method does not take idiosyncratic movements into account and basically compares with the general trend. A second, model-based method compares the actual market performance of the outstanding CoCos with a theoretical model. The main conclusion of the investigation of 24 cases of new CoCo bond issues is a moderate negative effect on the outstanding CoCos.

**Keywords** Contingent convertibles · CoCo bonds · New issuance

# **1 Introduction**

Contingent convertible bonds or CoCo bonds are new hybrid capital instruments that have a loss absorbing capacity which is enforced either automatically via the breaching of a particular CET1 level or via a regulatory trigger. CoCos either convert

J. De Spiegeleer (B) · I. Marquet · W. Schoutens

Department of Mathematics, KU Leuven, Celestijnenlaan 200B,

Box 2400, 3001 Leuven, Belgium

e-mail: Jan.Spiegeleer@mac.com

I. Marquet e-mail: Ine.Marquet@wis.kuleuven.be

W. Schoutens e-mail: wim@schoutens.be

S. Höcht Assenagon GmbH, Prannerstraße 8, 80333 München, Germany e-mail: Stephan.Hoecht@assenagon.com

into equity or suffer a write-down of the face value upon the appearance of such a trigger event.

The financial crisis of 2007–2008 triggered an avalanche of financial worries for financial institutions around the globe. After the collapse of Lehman Brothers, governments intervened and bailed out banks using taxpayers' money. Preventing such bail-outs in the future, and designing a more stable banking sector in general, requires both higher capital levels and regulatory capital of a higher quality. The implementation under new regulatory frameworks like Basel III and the Capital Requirements Directive IV (CRD IV) tries to achieve this in various ways, e.g. with the use of CoCo bonds (Basel Committee on Banking Supervision [1], European Commission [2]). CoCo bonds are allowed as new capital instruments under the Basel III guidelines. The Swiss regulators have forced their systemically important banks to issue large amounts of these instruments. Furthermore, the European CRD IV, which entered into force on 17 July 2013, requires all new additional Tier 1 instruments to have CoCo features.

The specific design of a CoCo bond automatically enhances the capital of a bank when it is in trouble. Hence, a loss-absorbing cushion is created with the aim to avoid, or at least reduce, potential interventions using taxpayers' money.

The first CoCos were issued in the aftermath of the credit crisis. In December 2009, Lloyds exchanged some of its old hybrid instruments for this new type of bond in order to strengthen its capital position after the bank had been hit very hard by the financial crisis of 2008. Since then many other banks have issued CoCos and many more are expected to follow in the coming years. The CoCo market currently exceeds USD 100 bn and is expanding very rapidly.<sup>1</sup>

When an issuer already has some CoCos outstanding and announces the issuance of a new CoCo bond, there are at least two opposing forces at work. On the one hand, a new issue means that the capital of the issuing institution is strengthened (at the additional Tier 1 or Tier 2 level). Due to the new issue, the losses in case of a future trigger event will be shared over a larger bucket and hence recovery rates are expected to be higher. On the other hand, there are the market dynamics and investors who often prefer to invest in the new CoCo rather than in the older ones. This can simply be because one prefers new issues over old ones, but also because one believes there is a basis spread to be earned on a new issuance. Some believe a new issue is brought to the market at a certain discount, to attract investors and to make the whole capital raising exercise a success. Investors will then move out of the old bonds and ask for allocation in the new issue.

In this paper, we estimate the price impact on the outstanding CoCos via two methods. The first method compares the returns of the outstanding CoCo bonds after the announcement of a new issue with some overall CoCo indices. More precisely, we compare the performance with the CS Contingent Convertible Euro Total Return index and the BofA Merrill Lynch Contingent Capital index. Here we basically compare the performance of the outstanding CoCos with the general market performance. However, such a comparison does not take into account idiosyncratic movements; it

<sup>1</sup>Source: Bloomberg.

basically compares with the general market trend. The issuing company is nevertheless exposed to market dynamics. Its stock price, its creditworthiness, etc. can evolve differently over time compared with the respective quantities of its competitors. This can especially be the case around capital raising announcements, since financial details of the company are then published and discussed at, for example, roadshows around the new issuance. Therefore, we also deploy a second methodology that takes idiosyncratic movements into account. Using an equity derivatives model, we compare the actual market performance of the outstanding CoCo bonds with a theoretical model performance that accounts for idiosyncratic effects, like movements in the underlying stock, credit default spreads or volatilities. The model is derivatives-based and as such takes forward-looking expectations into account.

In total, we investigate 24 cases of new CoCo bond issues. The main conclusion of the investigation is that there is a moderate negative effect on outstanding CoCo bonds. This is confirmed by both methodologies and the impact is an underperformance of about 25–50 bps on average between the announcement date and the issue date. An extra negative impact of 40 bps was observed in the 10 trading days after the issue.

The analysis in this paper is constrained to CoCo bonds only, but a similar study could be done for other types of bonds as well. A comparable study for corporate bonds was done, e.g., in Akigbe et al. [3], where the authors investigate the impact on 574 outstanding debt issues. The investigation was broken down by the different reasons for a new debt issue. A significant negative impact on the price of the outstanding debt and equity was observed when the public debt securities were issued to finance unexpected cash flow shortfalls. No significant reaction was observed when the new debt issues were motivated by an unexpected increase in capital expenditures, an unexpected increase in leverage or the expected refinancing of outstanding debt.

This paper is organized as follows. In the next section we provide the details of the equity derivatives model. In Sect. 3, we provide details on the data set used and, in particular, give an overview of the new issuances of the whole battery of issuers that are part of our study. Next, we report on the exact methodology and results of our comparison with CoCo indices. The final part of that section reports and discusses the results of the equity derivatives model. The final section concludes.

# **2 The Equity Derivatives Model**

CoCos are hybrid instruments with characteristics of both debt and equity. This gives rise to different approaches for pricing CoCos. Leaving aside heuristic models, two main schools of thought exist, namely structural models and market-implied models. Structural models are based on the theory of Merton and can be found in Pennacchi [4] and Pennacchi et al. [5]. We will apply a market-implied model whose derivation is based on market data such as share prices, credit default spreads and volatilities. The models were introduced in a Black–Scholes framework in De Spiegeleer and Schoutens [6] and De Spiegeleer et al. [7]. Pricing CoCos under smile-conform models can be found in Corcuera et al. [8]. Based on the Heston model, the impact of skew is discussed in De Spiegeleer et al. [9]. In De Spiegeleer et al. [10] the implied CET1 volatility is derived from the market price of a CoCo bond. Further extensions and variations can be found in De Spiegeleer and Schoutens [11, 12], Corcuera et al. [13], De Spiegeleer and Schoutens [14], Cheridito and Zhikai [15], Madan and Schoutens [16].

The actual valuation of a CoCo incorporates the modelling of both the trigger probability and the expected loss for the investor. Notice that the trigger is defined by a particular CET1 level or is subject to a regulator's decision. Since these trigger mechanisms are hard to model or even quantify, we project the trigger into the stock price framework as considered in the equity derivatives model of De Spiegeleer and Schoutens [6]. This means that, under the model, the CoCo is triggered once the share price drops below a specified barrier level, denoted by *S*\*. We infer from existing CoCo market data the share price at the moment the CoCo bond gets triggered and call this the (implied) trigger level. As a result, the valuation of a CoCo bond is transformed into a barrier-pricing exercise in an equity setting.

Under such a framework the CoCo bond can be broken down into several different derivative instruments. In the first place, the CoCo behaves like a standard (non-defaultable) corporate bond, where the holder receives coupons *c<sub>i</sub>* at regular time points *t<sub>i</sub>* together with the principal *N* at maturity *T*. However, in case the share price drops below the trigger level *S*\*, the investor loses his initial investment and all future coupons. This is modelled by short positions in binary down-and-in (BIDINO) options with maturities *t<sub>i</sub>* for each coupon *c<sub>i</sub>* and a BIDINO with maturity *T* to model the cancelling of the initial value. After the trigger event has occurred, the investor of a conversion CoCo receives *C<sub>r</sub>* shares. We can model this with *C<sub>r</sub>* down-and-in asset-(at hit)-or-nothing options on the stock. For a write-down CoCo, the investor does not receive any shares and we can simply set *C<sub>r</sub>* equal to zero. Therefore, the price of a CoCo can be calculated with the following formula:

$$\begin{aligned} P = {} & \text{Corporate bond} \\ & - N \times \text{binary down-and-in option} \\ & - \sum\_i c\_i \times \text{binary down-and-in options} \\ & + C\_r \times \text{down-and-in asset-(at hit)-or-nothing options on the stock} \end{aligned}$$

Under the Black–Scholes model, we can find an explicit formula for the price of the CoCo at time *t*:

$$\begin{split} P &= N \exp(-r(T-t)) + \sum\_{i=1}^{k} c\_{i} \exp(-r(t\_{i}-t)) \\ &- N \exp(-r(T-t)) \left[ \varPhi(-x\_{1} + \sigma \sqrt{T-t}) + (S^\*/S)^{2\lambda - 2}\, \varPhi(y\_{1} - \sigma \sqrt{T-t}) \right] \\ &- \sum\_{i} c\_{i} \exp(-r(t\_{i}-t)) \left[ \varPhi(-x\_{1i} + \sigma \sqrt{t\_{i}-t}) + (S^\*/S)^{2\lambda - 2}\, \varPhi(y\_{1i} - \sigma \sqrt{t\_{i}-t}) \right] \\ &+ C\_{r}\, S^\* \left[ \left(\frac{S^\*}{S}\right)^{a+b} \varPhi(z) + \left(\frac{S^\*}{S}\right)^{a-b} \varPhi(z - 2b\sigma\sqrt{T-t}) \right] \end{split} \tag{1}$$

with

$$\begin{aligned} z &= \frac{\log(S^\*/S)}{\sigma\sqrt{T-t}} + b\sigma\sqrt{T-t} & x\_1 &= \frac{\log(S/S^\*)}{\sigma\sqrt{T-t}} + \lambda\sigma\sqrt{T-t} \\ a &= \frac{r - q - \frac{1}{2}\sigma^2}{\sigma^2} & y\_1 &= \frac{\log(S^\*/S)}{\sigma\sqrt{T-t}} + \lambda\sigma\sqrt{T-t} \\ b &= \frac{\sqrt{(r - q - \frac{1}{2}\sigma^2)^2 + 2r\sigma^2}}{\sigma^2} & x\_{1i} &= \frac{\log(S/S^\*)}{\sigma\sqrt{t\_i - t}} + \lambda\sigma\sqrt{t\_i - t} \\ \lambda &= \frac{r - q + \sigma^2/2}{\sigma^2} & y\_{1i} &= \frac{\log(S^\*/S)}{\sigma\sqrt{t\_i - t}} + \lambda\sigma\sqrt{t\_i - t} \end{aligned}$$

where Φ is the cdf of the standard normal distribution, *r* is the risk-free rate, *q* the dividend yield and σ the volatility.
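A direct transcription of formula (1) can be sketched as follows. The function name and the sanity-check parameters are our own; the formula itself follows the structure above, with `coupons` given as (*t<sub>i</sub>*, *c<sub>i</sub>*) pairs.

```python
from math import exp, log, sqrt
from statistics import NormalDist

Phi = NormalDist().cdf  # standard normal cdf

def coco_price(S, Sstar, N, coupons, Cr, r, q, sigma, T, t=0.0):
    """Closed-form CoCo price of Eq. (1): corporate bond, minus BIDINOs
    cancelling principal and coupons at the barrier S*, plus Cr down-and-in
    asset-(at hit)-or-nothing options. coupons: list of (t_i, c_i) pairs."""
    tau = T - t
    lam = (r - q + sigma ** 2 / 2) / sigma ** 2
    a = (r - q - sigma ** 2 / 2) / sigma ** 2
    b = sqrt((r - q - sigma ** 2 / 2) ** 2 + 2 * r * sigma ** 2) / sigma ** 2
    x1 = log(S / Sstar) / (sigma * sqrt(tau)) + lam * sigma * sqrt(tau)
    y1 = log(Sstar / S) / (sigma * sqrt(tau)) + lam * sigma * sqrt(tau)
    z = log(Sstar / S) / (sigma * sqrt(tau)) + b * sigma * sqrt(tau)

    # straight (non-defaultable) corporate bond part
    p = N * exp(-r * tau) + sum(c * exp(-r * (ti - t)) for ti, c in coupons)
    # short BIDINO on the principal
    p -= N * exp(-r * tau) * (Phi(-x1 + sigma * sqrt(tau))
         + (Sstar / S) ** (2 * lam - 2) * Phi(y1 - sigma * sqrt(tau)))
    # short BIDINOs on each coupon
    for ti, c in coupons:
        ti_ = ti - t
        x1i = log(S / Sstar) / (sigma * sqrt(ti_)) + lam * sigma * sqrt(ti_)
        y1i = log(Sstar / S) / (sigma * sqrt(ti_)) + lam * sigma * sqrt(ti_)
        p -= c * exp(-r * ti_) * (Phi(-x1i + sigma * sqrt(ti_))
             + (Sstar / S) ** (2 * lam - 2) * Phi(y1i - sigma * sqrt(ti_)))
    # Cr down-and-in asset-(at hit)-or-nothing options (Cr = 0: write-down)
    p += Cr * Sstar * ((Sstar / S) ** (a + b) * Phi(z)
         + (Sstar / S) ** (a - b) * Phi(z - 2 * b * sigma * sqrt(tau)))
    return p
```

As a sanity check, for a write-down CoCo (*C<sub>r</sub>* = 0) the price must lie strictly between zero and the value of the straight bond, and it must increase as the share price moves away from the barrier.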

Applying this equity derivatives pricing model, a CoCo price can be found for a given trigger level *S*\*. However, the other way around is often more interesting: knowing the market CoCo price, we can filter out an implied trigger *S*\* such that market and model price match. Since CoCos of one financial institution with the same contractual trigger should trigger at the same time, their implied trigger levels should theoretically also be the same. Hence the implied barriers give us a way to compare different CoCos in order to detect over- or undervaluation, irrespective of different currencies and maturities.
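The inversion described here is a one-dimensional root search. Since the model price decreases as the barrier rises (a higher trigger level means a likelier trigger and hence a cheaper CoCo), a simple bisection suffices. In this sketch, `price_fn` stands for any such pricing function and is an assumption of ours, not part of the paper's notation.

```python
def implied_barrier(price_fn, market_price, lo, hi, tol=1e-8):
    """Filter out the implied trigger S* so that model and market price match.
    price_fn: maps a candidate barrier S* to a model CoCo price, assumed
    strictly decreasing in S* on [lo, hi], with market_price attained inside."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if price_fn(mid) > market_price:
            lo = mid  # model price still too high: the barrier must be higher
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For instance, for a toy pricer that loses one unit of value per unit of barrier, `implied_barrier(lambda s: 100.0 - s, 60.0, 0.0, 100.0)` recovers a barrier of 40.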

Our goal is to compare the actual market performance of the outstanding CoCo bonds with the theoretical model performance. This theoretical price takes idiosyncratic effects into account. Any change in the actual market performance relative to the theoretical model performance will be ascribed to the effect of the announcement of a new CoCo issuance. The analysis can also be translated in terms of implied trigger levels. In case the new CoCo does not influence the outstanding CoCo, the implied barrier of the outstanding CoCo should remain constant. If, however, the implied barrier derived from the market changes, this change is attributed to the new CoCo issuance.

# **3 Measuring the Price Performance of the Outstanding CoCos**

# *3.1 New Issuances*

The impact of a new CoCo issuance is investigated on the outstanding CoCos of the same issuing company. The issuers in our study comprise UBS, Barclays, Crédit Agricole, Société Générale, Deutsche Bank, UniCredit, Credit Suisse, Santander, Rabobank, Danske and BBVA. The effect on the outstanding CoCos is investigated in the period between announcement and issuance of the new CoCo; these dates are summarised in Table 1. Notice that UBS, Barclays and Crédit Agricole all have


**Table 1** Announcement date, issue date and issue size (in bn) of the new CoCos




aIncl. XS1055037920

bIncl. US06738EAB11 and XS1068574828 cIncl. CH0271428333 and CH0271428309

*Source* Bloomberg/own calculations

issued different CoCos on the same day. Since it is not possible to distinguish their influence from each other, these new CoCos are assumed to have one general impact on all the outstanding CoCos of the same issuing company.

# *3.2 CoCo Index Comparison*

The first analysis uses indices as a benchmark to observe a possible impact. It basically compares the returns of the outstanding CoCo bonds after the announcement of a new issue with some overall CoCo indices. More precisely, we compare the performance with the CS Contingent Convertible Euro Total Return index and the BofA Merrill Lynch Contingent Capital index (whenever the data is available). The methods are explained for one particular new CoCo issuance, namely the USF22797YK86 CoCo of Crédit Agricole. At the end, the overall results and conclusions are shown.

#### **3.2.1 Method**

In a first step, we analyse the impact of each new CoCo separately on all the outstanding CoCos of the same issuer. The daily simple returns of the outstanding CoCos are derived for the period between the announcement date and the issue date of the new CoCo. In a second step, we accumulate these simple returns and obtain the returns between announcement and issue date. As an example, these first steps are shown for two outstanding CoCos of Crédit Agricole in Fig. 1.

On each day, we calculate the (equally weighted) average of the cumulative simple returns of all outstanding CoCos. In a last step, we take the difference between these averages and the cumulative returns of the CoCo index on each day between the announcement date and issue date of the new CoCo.
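The steps above can be sketched as follows: daily simple returns are compounded into cumulative returns, averaged with equal weights across the outstanding CoCos, and differenced against the index. The input layout (price lists per ISIN) is our own assumption.

```python
def excess_cum_return(coco_prices, index_levels):
    """Per-day difference between the equally weighted average cumulative
    return of the outstanding CoCos and the cumulative return of a CoCo
    index, over the window from announcement date to issue date.
    coco_prices: dict ISIN -> list of daily prices; index_levels: index list."""
    def cum_returns(prices):
        out, growth = [], 1.0
        for p0, p1 in zip(prices, prices[1:]):
            growth *= p1 / p0          # compound the daily simple returns
            out.append(growth - 1.0)
        return out

    per_coco = [cum_returns(p) for p in coco_prices.values()]
    avg = [sum(day) / len(day) for day in zip(*per_coco)]  # equal weights
    idx = cum_returns(index_levels)
    return [a - b for a, b in zip(avg, idx)]
```

A negative entry means the outstanding CoCos underperformed the index on that day of the observation window.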

**Fig. 1** Impact of USF22797YK86. **a** Daily returns. **b** Cumulative returns

#### **3.2.2 Results**

In Table 2, the difference in cumulative returns over the observation period, i.e. the period between announcement and issuance, is shown. For some observation periods, the Merrill Lynch index did not yet exist. When an outstanding CoCo does show a significant change compared to the global index, we can attribute this change to the new CoCo issuance. The averaged difference over all new CoCos analysed is shown in Fig. 2. These averaged differences in cumulative returns are shown from one to five days after the announcement of the new CoCo and also over the full period as given in Table 2. As a conclusion, we see that on average the outstanding CoCos suffer a negative impact of around 25 bps on their return between announcement and issuance due to a new CoCo.

Multiple CoCo indices can be used for this analysis, but CoCo indices are relatively new on the market. As such, we are obliged to restrict our analysis to indices already available during the period of each analysis. Remark also that we need to handle these indices with care, in the sense that the indices are applied to give a global market view on CoCos. A point of criticism of this approach is that the indices may not be that representative of the true market. There is also a high concentration of some issuers in the indices; e.g. for the ML index the top 5 issuers make up almost 50% of the index (as of December 2014).

Furthermore, this comparison with the general market performance does not take idiosyncratic movements into account but compares with the general market trend. The issuing company is nevertheless exposed to individual dynamics. Its stock price, its creditworthiness, etc. can evolve differently from those of its competitors. This can especially be the case around capital raising announcements, since financial details of the company are then published and discussed at, for example, road-shows around the new issuance. Therefore, we move on to a second methodology that takes idiosyncratic movements into account.

**Table 2** Averaged difference in cumulative returns (in %) between the outstanding CoCos and the Credit Suisse CoCo index (left column) and Merrill Lynch CoCo index (right column) over the observation period of the new CoCo


*Source* Bloomberg/own calculations

**Fig. 2** Averaged difference in cumulative returns between the outstanding CoCos and the Credit Suisse and Merrill Lynch CoCo index

# *3.3 Model-Based Performance*

As experienced by all CoCo investors, the difficulty with these financial products lies in their differing characteristics, which make them hard to compare: trigger type, conversion type, maturity, coupon cancellation, and so on. However, the implied barrier methodology can be used as a tool to compare CoCos with different characteristics. In this second approach, we will use the implied barrier to derive theoretical values for the outstanding CoCos under the assumption of no impact by the new CoCo issuance and compare them with the actual market values.

#### **3.3.1 Method**

The implied barrier can be interpreted as the stock price level (assumed by the market) that is hit (for the first time) when the CoCo gets converted or written down. If nothing changes, the market will keep the same view on the implied barrier level, resulting in a constant implied barrier over time. In other words, when there is no impact due to this new CoCo, no change should theoretically be observed in

**Fig. 3** Impact of USF22797YK86. **a** Implied barriers. **b** CoCo quotes. **c** Returns compared with announcement. **d** Difference in returns

the implied barrier. As such we can see in the levels of the implied barrier if there is an impact due to the announced new CoCo. This leads us easily to the second approach of our impact analysis. As an example, we show the implied barriers of the two outstanding CoCos of ACAFP from the previous section in Fig. 3a.

As in the previous section, the implied barriers can be translated into CoCo quotes. The theoretical CoCo price does not take any information on a new CoCo issuance into account, by assuming a constant implied barrier. These values can be used as our reference. Any change in the market compared with this reference is then due to the impact of the announcement of a new CoCo issuance. As such, we can calculate the theoretical CoCo prices from a constant implied barrier and compare them with the market values. The results for our CoCo examples are shown in Fig. 3b. As a last step, we define cheapness as the difference between the market CoCo return and the theoretical CoCo return relative to the announcement date. In Fig. 3, the cheapness of the two outstanding CoCos of Crédit Agricole is shown.
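The cheapness measure can be written compactly; day 0 of each series is the announcement date, and the argument names are illustrative.

```python
def cheapness(market_prices, model_prices):
    """Cheapness per day: market return minus model-implied return, both
    measured relative to the announcement date (index 0 of each series).
    Negative values mean the market trades below the no-impact model."""
    m0, th0 = market_prices[0], model_prices[0]
    return [m / m0 - th / th0 for m, th in zip(market_prices, model_prices)]
```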


#### **3.3.2 Results**

An overall view of the cheapness is derived by averaging the differences between theoretical and market CoCo prices for each outstanding CoCo during the observation period of the new CoCo. We averaged the differences of all the CoCos from one to five days after the announcement and also on the issue date of the new CoCo (Table 3).

Clearly, from Fig. 4, on average the cheapness on each day of our observation period is negative, meaning that the market price is below the theoretical price assuming no impact. As such, we conclude also from this approach that there is a negative impact of about 42 bps on average on the outstanding CoCos when a new CoCo issuance is announced.

# **4 Impact After Issue Date**

So far, we have investigated the impact of a new CoCo issuance between the announcement and issue date. In this section, we show the results for a longer observation period. More concretely, both analyses are extended to 10 trading days after the issue date.

From our first analysis, where we compare the outstanding CoCos with the CoCo indices, a downward-trending impact is observed in Fig. 5a. The second analysis, which compares the market and model prices of the outstanding CoCos, is shown in Fig. 5b. In both analyses, the negative impact becomes more pronounced after the issue dates. Hence, up to 10 trading days after the issue date, a negative impact is still observable.

# **5 Conclusion**

The price performance of outstanding CoCos was investigated after a new CoCo issue was announced by the same issuer. Based on two approaches, we estimated the price impact on the outstanding CoCos. The first method compared the returns of the outstanding CoCos with some overall CoCo indices. As a conclusion, we found that the return of the outstanding CoCos during the period between announcement and issuance was slightly lower than the returns of the CoCo indices. There was an underperformance of about 22 bps compared with the Credit Suisse index and about 42 bps compared with the Merrill Lynch index (although with relatively high standard deviations). Since this first study did not take idiosyncratic movements into account, we also used a second method based on the equity derivatives model for CoCos. In this method we compared the actual market performance of the outstanding CoCo bonds with a theoretical model performance taking into account idiosyncratic effects, like movements in the underlying stock, credit default spreads and volatilities. This second approach also concludes that the averaged market returns of the outstanding CoCos were about 42 bps lower than one would expect in case of no influence.

In total, we investigated 24 cases of new CoCo bond issues. The main conclusion of the investigation is that there is a moderate negative effect on outstanding CoCo bonds. This is confirmed by both methodologies and the impact is an underperformance of about 20–40 bps on average between the announcement date and the issue date. During the period of 10 trading days after the issue date, an extra decrease of 40 bps was observed.

**Acknowledgements** The authors would like to thank Robert Van Kleeck and Michael Hünseler for useful discussions on the topic.

The KPMG Center of Excellence in Risk Management is acknowledged for organizing the conference "Challenges in Derivatives Markets - Fixed Income Modeling, Valuation Adjustments, Risk Management, and Regulation".

**Open Access** This chapter is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, a link is provided to the Creative Commons license and any changes made are indicated.

The images or other third party material in this chapter are included in the work's Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work's Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.

# **The Impact of Cointegration on Commodity Spread Options**

**Walter Farkas, Elise Gourier, Robert Huitema and Ciprian Necula**

**Abstract** In this work we explore the implications of cointegration in a system of commodity prices for the premiums of options written on various spreads on the futures prices of these commodities. We employ a parsimonious, yet comprehensive model for cointegration in a system of commodity prices. The model has an exponential affine structure and is flexible enough to allow for an arbitrary number of cointegration relationships. We conduct an extensive simulation study on pricing spread options. We argue that cointegration creates an upward sloping term structure of correlation, which in turn lowers the volatility of spreads and consequently the price of options on them.

**Keywords** Cointegration · Futures prices · Commodities · Spread options · Simulation

# **1 Introduction**

A distinctive feature of commodity markets is the existence of long-run equilibrium relationships between the levels of various commodity prices, such as the one between the price of crude oil and the price of heating oil. These long-run equilibrium relations can be captured in economic models by so-called *cointegration* relations.

W. Farkas (B) · R. Huitema · C. Necula
Department of Banking and Finance, University of Zurich, Plattenstrasse 14, 8032 Zurich, Switzerland
e-mail: walter.farkas@bf.uzh.ch

W. Farkas
Swiss Finance Institute, Zurich, Switzerland

W. Farkas
Department of Mathematics, ETH Zurich, Rämistrasse 101, 8092 Zurich, Switzerland

E. Gourier
School of Economics and Finance, Queen Mary, University of London, London, UK

C. Necula
Department of Money and Banking, Bucharest University of Economic Studies, Bucharest, Romania

In this work we employ the continuous time model of cointegrated commodity prices developed by the authors in Farkas et al. [6] in order to conduct a simulation study for assessing the impact of cointegration on spread options. In our model, commodity prices are non-stationary and several cointegration relations are allowed amongst them, capturing long-run equilibrium relationships. Cointegration (Engle and Granger [5]) is the property of two or more non-stationary time series of having at least one linear combination that is stationary.

There is a vast literature on modeling the price of a single commodity as a non-stationary process (see Back and Prokopczuk [1] for a comprehensive recent review). For example, Schwartz and Smith [13] assume the log price of a commodity to be the sum of two latent factors: the long-term equilibrium level, modeled as a geometric Brownian motion, and a short-term deviation from the equilibrium, modeled as a zero-mean Ornstein–Uhlenbeck (OU) process. More recently, Paschke and Prokopczuk [11] propose to model these deviations as a more general CARMA process, and Cortazar and Naranjo [3] generalize the Schwartz and Smith [13] model in a multifactor framework.

However, the literature on modeling a system of commodity prices is still quite scarce. Two fairly recent models are proposed in Cortazar et al. [4] and Paschke and Prokopczuk [10], both of which account for cointegration by incorporating common and commodity-specific factors into their modeling framework. Amongst the common factors, only one is assumed non-stationary. Although they explicitly take into account cointegration between prices, the cointegrated systems generated by these two models do not cover the whole range of possible numbers of cointegration relations, but allow for either none or exactly *n* − 1 relations between the *n* prices. In Farkas et al. [6] we propose an easy-to-use, yet comprehensive, model for a system of cointegrated commodity prices that retains the exponential affine structure of previous approaches and allows, at the same time, for an arbitrary number of cointegration relationships.

The rest of the work is organized as follows. In Sect. 2 we briefly describe the model proposed in Farkas et al. [6] and point out some qualitative aspects regarding the dynamics of the system. Section 3 is devoted to an extensive simulation study focused on computing spread options prices and on assessing the impact of cointegration on pricing spread options. Section 4 is reserved for concluding remarks.

# **2 Outline of the Model**

Before proceeding to the simulation study, in this section we present, for the sake of completeness, a short description of the model developed in Farkas et al. [6].

Consider *n* commodities with spot prices **S**(*t*) = (*S*1(*t*), . . . , *Sn*(*t*))<sup>⊤</sup>.

First it is assumed that the spot log-prices **X**(*t*) = log **S**(*t*) can be decomposed into three components:

$$\mathbf{X}(t) = \mathbf{Y}(t) + \boldsymbol{\varepsilon}(t) + \boldsymbol{\phi}(t), \tag{1}$$

where **Y**(t) signifies the long-run levels, *ε*(*t*) is an *n*-dimensional stationary process capturing short-term deviations, and *φ*(*t*) = *χ*<sup>1</sup> cos(2π*t*) + *χ*<sup>2</sup> sin(2π*t*) controls for seasonal effects with *χ*<sup>1</sup> and *χ*<sup>2</sup> being *n*-dimensional vectors of constants.

The notion of cointegration (Engle and Granger [5], Johansen [7], Phillips [12]) refers to the property of two or more non-stationary time series of having a linear combination that is stationary. For example, if *X*1(*t*) and *X*2(*t*) are two non-stationary processes, one says that they are cointegrated if there is a linear combination of them, *X*1(*t*) − α*X*2(*t*), that is stationary for some positive real α. Intuitively, cointegration occurs when two or more non-stationary variables are linked in a long-run equilibrium relationship from which they might depart only temporarily.
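This definition can be illustrated with a minimal discrete-time sketch (the series, parameter values, and AR(1) deviation below are illustrative assumptions, not part of the chapter's model): two non-stationary series whose linear combination *X*1(*t*) − α*X*2(*t*) is stationary.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 10_000
alpha = 0.8

# X2 is a pure random walk, hence non-stationary
X2 = np.cumsum(0.01 * rng.standard_normal(T))

# A stationary AR(1) deviation keeps X1 tied to alpha * X2 in the long run
noise = np.zeros(T)
for t in range(1, T):
    noise[t] = 0.95 * noise[t - 1] + 0.01 * rng.standard_normal()
X1 = alpha * X2 + noise

# The cointegrating combination has bounded dispersion,
# while each series individually wanders without bound
print(np.std(X1 - alpha * X2), np.std(X2))
```

Individually, `X1` and `X2` drift arbitrarily far from their starting points, yet their spread only fluctuates around zero — exactly the temporary departures from a long-run equilibrium described above.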

Regarding cointegration in the model, we stress that *n* cointegration relationships are implicitly assumed by (1): the *n* seasonally adjusted spot log-prices **X**(*t*) − *φ*(*t*) are cointegrated with their corresponding long-run levels, **Y**(*t*), since the linear combination **X**(*t*) − *φ*(*t*) − **Y**(*t*) is stationary.

Secondly, cointegration is allowed to exist between the variables in **Y**(*t*) as well. We denote the number of cointegration relationships between them by *h*, where 0 ≤ *h* < *n*. The corresponding cointegration matrix is symbolized by Θ, an *n* × *n* matrix with the last *n* − *h* rows equal to zero vectors. Each of the *h* non-zero rows of Θ encodes a stationary (i.e., cointegrating) combination of the variables in **Y**(*t*), normalized such that Θ*ii* = 1, *i* ≤ *h*. The total *n* + *h* cointegration relationships between the variables in the vector **Z**(*t*) := (**X**(*t*) − *φ*(*t*), **Y**(*t*)) can be characterized by the (2*n* × 2*n*)-matrix

$$\begin{bmatrix} \mathbf{I}_n & -\mathbf{I}_n \\ \mathbf{O}_n & \boldsymbol{\Theta} \end{bmatrix},$$

where **O***n* denotes the zero matrix of dimension *n* × *n*.

The dynamics of **X**(*t*) and **Y**(*t*) under the real-world probability measure is assumed to be given by:

$$d\begin{bmatrix} \mathbf{X}(t) - \phi(t) \\ \mathbf{Y}(t) \end{bmatrix} = \begin{bmatrix} \mathbf{0}\_{n} \\ \boldsymbol{\mu}\_{\text{y}} \end{bmatrix} dt + \begin{bmatrix} -K\_{\text{x}} & \mathbf{O}\_{n} \\ \mathbf{O}\_{n} & -K\_{\text{y}} \end{bmatrix} \begin{bmatrix} \mathbf{I}\_{n} & -\mathbf{I}\_{n} \\ \mathbf{O}\_{n} & \boldsymbol{\Theta} \end{bmatrix} \begin{bmatrix} \mathbf{X}(t) - \phi(t) \\ \mathbf{Y}(t) \end{bmatrix} dt$$

$$+ \begin{bmatrix} \boldsymbol{\Sigma}\_{\text{x}}^{\frac{1}{2}} & \mathbf{O}\_{n} \\ \boldsymbol{\Sigma}\_{\text{xy}}^{\frac{1}{2}} & \boldsymbol{\Sigma}\_{\text{y}}^{\frac{1}{2}} \end{bmatrix} d \begin{bmatrix} \mathbf{W}\_{\text{x}}(t) \\ \mathbf{W}\_{\text{y}}(t) \end{bmatrix}, \tag{2}$$

where **0***n* is an *n*-dimensional vector of zeros, and **W** := (**W***x*, **W***y*) is a 2*n*-dimensional standard Brownian motion. Furthermore, the matrix

$$\begin{bmatrix} -K_x & \mathbf{O}_n \\ \mathbf{O}_n & -K_y \end{bmatrix}$$

measures the speed by which **Z**(*t*) reverts to its long-run (cointegration) equilibrium level. More specifically, *Kx* quantifies the speed of mean reversion of the elements in **X** around the long-term levels in **Y**. The matrix *Ky* is an *n* × *n* matrix with the last *n* − *h* columns equal to zero vectors, such that *Ky*Θ is an *n* × *n* matrix of rank *h*. Each of the *h* non-zero columns in *Ky* quantifies the speed of adjustment of each element in **Y** to the corresponding cointegration relation. The dynamics given by Eq. (2) is "error-correcting" in that a deviation from a given cointegration relation induces an appropriate change in the variables in the direction of correcting the deviation.

**Fig. 1** Simulated price paths for various choices of the Θ matrix. *Top panel* prices are non-stationary and there is one cointegration relation. *Middle panel* prices are non-stationary and there is no cointegration. *Bottom panel* prices are stationary

In order to assess qualitatively the role of the cointegration matrix Θ on the properties of the dynamics of the system, Fig. 1 depicts the results of a simulation of a system of three variables for various choices of the Θ matrix.

In the top panel of Fig. 1 we assume that there is a cointegration relation and that the first row of the Θ matrix is (1, 1, −1); therefore, the residual of the cointegration relation, *Y*1(*t*) + *Y*2(*t*) − *Y*3(*t*), is stationary. The middle panel, on the other hand, depicts the case when Θ is the null matrix; the prices are then non-stationary and not cointegrated. For example, the residual of the cointegration relation from the previous case, *Y*1(*t*) + *Y*2(*t*) − *Y*3(*t*), is no longer stationary. In fact, there is no stationary linear combination of the long-run levels in this case. Moreover, as depicted in the bottom panel, the model also allows for stationary prices, if the Θ matrix is of full rank.
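A sketch of how such paths can be generated: the Euler discretization below simulates the dynamics (2), ignoring seasonality and with Σ*xy* = **O***n*, for the top-panel configuration in which the first row of Θ is (1, 1, −1). All parameter values here are illustrative assumptions of this sketch, not the chapter's calibration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
dt, T = 1 / 252, 30.0

# Illustrative parameters (assumptions for this sketch)
Theta = np.zeros((n, n))
Theta[0] = [1.0, 1.0, -1.0]                 # residual Y1 + Y2 - Y3 is stationary
K_x = np.diag([1.5, 1.0, 0.5])              # mean reversion of X around Y
K_y = np.zeros((n, n)); K_y[0, 0] = 1.0     # only Y1 error-corrects
mu_y = np.full(n, 0.02)
sqrt_Sx = 0.25 * np.eye(n)                  # Sigma_x^{1/2}
sqrt_Sy = 0.15 * np.eye(n)                  # Sigma_y^{1/2}

X = np.full(n, 2.0)
Y = np.full(n, 2.0)
residuals = []
for _ in range(int(T / dt)):
    dWx = np.sqrt(dt) * rng.standard_normal(n)
    dWy = np.sqrt(dt) * rng.standard_normal(n)
    X = X + (-K_x @ (X - Y)) * dt + sqrt_Sx @ dWx
    Y = Y + (mu_y - K_y @ Theta @ Y) * dt + sqrt_Sy @ dWy
    residuals.append(Theta[0] @ Y)

# Prices are non-stationary, but the cointegration residual stays bounded
print(np.mean(residuals), np.std(residuals))
```

The drift of `X` is −*Kx*(**X** − **Y**) and the drift of `Y` is *μy* − *Ky*Θ**Y**, which is the product form of the drift in Eq. (2); tracking Θ row 1 applied to **Y** shows the error-correcting behaviour directly.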

The characteristic functions of **X** and **Y** can be readily computed analytically given they are normally distributed since the dynamics of **Z**(*t*) = (**X**(*t*) − *φ*(*t*), **Y**(*t*)) is in fact given by a multivariate Ornstein–Uhlenbeck (OU) process:

$$d\mathbf{Z}(t) = \left[\boldsymbol{\mu} - K\mathbf{Z}(t)\right]dt + \boldsymbol{\Sigma}^{\frac{1}{2}}d\mathbf{W}(t),\tag{3}$$

with

$$\boldsymbol{\mu} := \begin{bmatrix} \mathbf{0}_n \\ \boldsymbol{\mu}_y \end{bmatrix}, \quad K := \begin{bmatrix} K_x & -K_x \\ \mathbf{O}_n & K_y\boldsymbol{\Theta} \end{bmatrix}, \quad \boldsymbol{\Sigma}^{\frac{1}{2}} := \begin{bmatrix} \boldsymbol{\Sigma}_x^{\frac{1}{2}} & \mathbf{O}_n \\ \boldsymbol{\Sigma}_{xy}^{\frac{1}{2}} & \boldsymbol{\Sigma}_y^{\frac{1}{2}} \end{bmatrix}, \quad \mathbf{W}(t) := \begin{bmatrix} \mathbf{W}_x(t) \\ \mathbf{W}_y(t) \end{bmatrix}.$$

At the same time, the vector of spot prices **S**(*T* ) can be written as an exponential function of **X**(*t*) and **Y**(*t*):

$$\begin{split} \mathbf{S}(T) &= \exp\bigg[e^{-K_x(T-t)}\mathbf{X}(t) + \psi(T-t)\mathbf{Y}(t) + \left[\phi(T) - e^{-K_x(T-t)}\phi(t)\right] \\ &\quad + \left[\int_t^T \psi(T-u)\,du\right]\boldsymbol{\mu}_y + \int_t^T \left[e^{-K_x(T-u)}\boldsymbol{\Sigma}_x^{\frac{1}{2}} + \psi(T-u)\boldsymbol{\Sigma}_{xy}^{\frac{1}{2}}\right]d\mathbf{W}_x(u) \\ &\quad + \int_t^T \psi(T-u)\boldsymbol{\Sigma}_y^{\frac{1}{2}}\,d\mathbf{W}_y(u)\bigg], \end{split} \tag{4}$$

where

$$
\psi(\tau) := K_x \int_0^\tau e^{-K_x(\tau-u)}\, e^{-K_y \boldsymbol{\Theta} u}\, du.
$$

Given the affine structure of the model, futures prices can also be obtained in closed form. Under the simplifying assumption of constant market prices of risk, one has

$$d\begin{bmatrix} \mathbf{W}_x^*(t) \\ \mathbf{W}_y^*(t) \end{bmatrix} = d\begin{bmatrix} \mathbf{W}_x(t) \\ \mathbf{W}_y(t) \end{bmatrix} + \begin{bmatrix} \boldsymbol{\lambda}_x \\ \boldsymbol{\lambda}_y \end{bmatrix} dt,$$

where **W**<sup>∗</sup>*x*(*t*) and **W**<sup>∗</sup>*y*(*t*) are standard Brownian motions under the risk-neutral measure, and *λx*, *λy* are the market prices of the **W***x*(*t*) and **W***y*(*t*) risks, respectively.

Under these circumstances it can be shown that at time *t* the vector of futures prices for the contracts with maturity *T* is given by

$$\mathbf{F}(t,T) = \exp\left\{\alpha(t,T) + \beta(T-t)\mathbf{X}(t) + \psi(T-t)\mathbf{Y}(t)\right\},\tag{5}$$

with β(τ) := *e*<sup>−*Kx*τ</sup> and with α(*t*, *T*) defined by

$$\begin{split} \alpha(t, t+\tau) &:= \left[\phi(t+\tau) - e^{-K_x\tau}\phi(t)\right] - \left(\mathbf{I}_n - e^{-K_x\tau}\right)K_x^{-1}\boldsymbol{\mu}_x^* + \left(\int_0^\tau \psi(\tau-u)\,du\right)\boldsymbol{\mu}_y^* \\ &\quad + \operatorname{diag}\left\{\frac{1}{2}\begin{bmatrix}\mathbf{I}_n & \mathbf{O}_n\end{bmatrix}\left[e^{-K\tau}\left(\int_0^\tau e^{Ku}\,\boldsymbol{\Sigma}\,e^{K^\top u}\,du\right)e^{-K^\top\tau}\right]\begin{bmatrix}\mathbf{I}_n \\ \mathbf{O}_n\end{bmatrix}\right\}, \end{split}\tag{6}$$

where diag(*A*) returns the vector with the diagonal elements of *A*, and where

$$\boldsymbol{\mu}^* = \begin{bmatrix} \boldsymbol{\mu}_x^* \\ \boldsymbol{\mu}_y^* \end{bmatrix} = \boldsymbol{\mu} + \boldsymbol{\Sigma}^{\frac{1}{2}} \begin{bmatrix} \boldsymbol{\lambda}_x \\ \boldsymbol{\lambda}_y \end{bmatrix}.$$

**Fig. 2** The relative contribution of various components to log futures prices

To better assess qualitatively the impact of the two factors, **X**(*t*) and **Y**(*t*), on the term structure of futures prices, we depict in Fig. 2 the relative contribution of the corresponding two terms in Eq. (5) to the logarithm of the futures prices on one of the commodities in a cointegrated system.

The contribution of the **X**(*t*) component decreases exponentially as a function of time to maturity. On the other hand, the **Y**(*t*) component contributes significantly for higher maturities. Therefore, the two factors capture the short-end and, respectively, the long-end of the term-structure of futures prices.

By Itô's lemma, the risk-neutral dynamics of **F**(*t*, *T* ) is given by

$$\frac{d\mathbf{F}(t,T)}{\mathbf{F}(t,T)} = \left[e^{-K_x(T-t)}\boldsymbol{\Sigma}_x^{\frac{1}{2}} + \psi(T-t)\boldsymbol{\Sigma}_{xy}^{\frac{1}{2}}\right]d\mathbf{W}_x^*(t) + \psi(T-t)\boldsymbol{\Sigma}_y^{\frac{1}{2}}\,d\mathbf{W}_y^*(t),\tag{7}$$

and it follows immediately that the variance–covariance matrix of returns on futures prices is given by:

$$\begin{split} \boldsymbol{\Xi}(\tau) &= e^{-K_x\tau}\boldsymbol{\Sigma}_x e^{-K_x^\top\tau} + \psi(\tau)\boldsymbol{\Sigma}_{xy}^{\frac{1}{2}}(\boldsymbol{\Sigma}_x^{\frac{1}{2}})^\top e^{-K_x^\top\tau} + e^{-K_x\tau}\boldsymbol{\Sigma}_x^{\frac{1}{2}}(\boldsymbol{\Sigma}_{xy}^{\frac{1}{2}})^\top\psi^\top(\tau) \\ &\quad + \psi(\tau)\boldsymbol{\Sigma}_{xy}\psi^\top(\tau) + \psi(\tau)\boldsymbol{\Sigma}_y\psi^\top(\tau), \end{split}\tag{8}$$

where τ = *T* − *t*.

Since the term structure of correlation of futures prices returns plays an important role in the results of the simulations performed in the following section, it is worthwhile to point out some qualitative results about this term structure.

First, Eq. (8) shows that unless *Kx* = **O***n*, the variance–covariance matrix Ξ(τ) depends on τ.

Second, let us consider the case where there is no instantaneous correlation between the shocks driving the dynamics, meaning that Σ*x* and Σ*y* are diagonal matrices and Σ*xy* is the null matrix. Moreover, let us assume that *Kx* is diagonal, meaning that the spot price of a commodity reacts only to its deviation from the long-run level and not to deviations of the other commodities. It follows that the first term in Eq. (8) is a diagonal matrix and the next three terms are null matrices. If, in addition, there is no cointegration in the system, meaning that Θ is the null matrix, then the last term in Eq. (8) is a diagonal matrix since ψ(τ) is also a diagonal matrix. So, in this case, the variance–covariance matrix Ξ(τ) is diagonal and, therefore, there is no correlation at any maturity. However, if there is at least one cointegration relation in the system, then the last term in Eq. (8) is no longer a diagonal matrix since ψ(τ) is not diagonal. Therefore, cointegration induces correlation at various maturities even though no instantaneous correlation between the Brownian motions was assumed in the model.
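This mechanism can be checked numerically. The sketch below evaluates Eq. (8) with Σ*xy* = **O***n* (so the cross terms vanish), using the Θ, *Kx*, *Ky*, Σ*x*, and Σ*y* values of the simulation study in Sect. 3. The trapezoidal quadrature for ψ(τ) and the availability of SciPy's `expm` are assumptions of this sketch.

```python
import numpy as np
from scipy.linalg import expm

def psi(tau, K_x, K_yTheta, steps=400):
    # psi(tau) = K_x * int_0^tau exp(-K_x (tau - u)) exp(-K_y Theta u) du
    us = np.linspace(0.0, tau, steps + 1)
    vals = np.array([expm(-K_x * (tau - u)) @ expm(-K_yTheta * u) for u in us])
    integral = (vals[0] + vals[-1]) / 2 + vals[1:-1].sum(axis=0)  # trapezoidal rule
    return K_x @ (integral * (tau / steps))

def futures_return_cov(tau, K_x, K_yTheta, S_x, S_y):
    # Eq. (8) with Sigma_xy = O_n: only the first and last terms survive
    E, P = expm(-K_x * tau), psi(tau, K_x, K_yTheta)
    return E @ S_x @ E.T + P @ S_y @ P.T

# Parameter values of the simulation study in Sect. 3
K_x = np.diag([1.5, 1.0, 0.5])
Theta = np.zeros((3, 3)); Theta[0] = [1.0, -0.4, -0.6]
K_y = np.zeros((3, 3)); K_y[0, 0] = 1.5
S_x = np.array([[0.0625, 0.0562, 0.0437],
                [0.0562, 0.0900, 0.0262],
                [0.0437, 0.0262, 0.1225]])
S_y = 0.0225 * np.eye(3)

def corr12(K_yTheta, tau=5.0):
    Xi = futures_return_cov(tau, K_x, K_yTheta, S_x, S_y)
    return Xi[0, 1] / np.sqrt(Xi[0, 0] * Xi[1, 1])

print(corr12(K_y @ Theta))       # with cointegration: sizeable long-run correlation
print(corr12(np.zeros((3, 3))))  # without: correlation essentially gone at 5 years
```

With Θ = **O**3, ψ(τ) is diagonal and the 5-year correlation decays to almost zero; with the cointegration relation it remains clearly positive, which is the upward-sloping correlation term structure discussed above.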

# **3 Spread Option Prices**

In this section, we focus on futures prices and prices of European-style options written on the *spread* between two or more commodities, such as the difference between the price of electric power and the cost of the natural gas needed to produce it, or the price difference between crude oil and a basket of various refined products, known as the crack spread. The crack spread is in fact related to the profit margin that an oil refiner realizes when "cracking" crude oil while simultaneously selling the refined products in the wholesale market. The oil refiner can hedge the risk of losing profits by buying an appropriate number of futures contracts on the crack spread or, alternatively, by buying call options on the crack spread. Since spread options have become regularly and widely used instruments in financial markets for hedging purposes, there is a growing need for a better understanding of the effects of cointegration on their prices.

There is an extensive literature on approximation methods for spread and basket options on two (e.g. Kirk [8]) or more than two commodities, with recent contributions from Li et al. [9] and Caldana and Fusai [2]. However, mostly for simplicity, we rely in this chapter on the Monte-Carlo simulation method for pricing spread options written on two or more commodities.

From Eq. (7), it follows that **F**(*t*, *T* ) (conditional on information available up to time *s* ≤ *t* ≤ *T* ) is distributed as follows:


$$\mathbf{F}(t,T) \sim \log \mathcal{N}\left(\log \mathbf{F}(s,T) - \frac{1}{2} \int\_{s}^{t} \text{diag}(\boldsymbol{\Xi}(T-\boldsymbol{u}))d\boldsymbol{u}, \int\_{s}^{t} \boldsymbol{\Xi}(T-\boldsymbol{u})d\boldsymbol{u}\right), \tag{9}$$

where diag(*X*) denotes the vector containing the diagonal elements of the matrix *X*. Note that **F**(*s*, *T* ) can be either computed from (5) or observed from data.

The fact that the distribution function of **F**(*t*, *T*) is known in an easy-to-use, analytic form is one of the key features of the model we employ. It allows us to simulate futures price curves at any future time *t* based on today's curves (time *s*) almost effortlessly. Hence, the price of a call option on the time-*T* value of a certain spread can be obtained by carrying out the following steps: (i) generate *M* samples *ε*<sup>(*m*)</sup> from a multivariate normal distribution with mean $-\frac{1}{2}\int_s^T \operatorname{diag}(\boldsymbol{\Xi}(T-u))\,du$ and variance–covariance matrix $\int_s^T \boldsymbol{\Xi}(T-u)\,du$;<sup>1</sup> (ii) compute the corresponding simulated futures prices

$$\mathbf{F}^{(m)} = \mathbf{F}(s, T)\exp\left\{\boldsymbol{\varepsilon}^{(m)}\right\};$$

(iii) compute the Monte-Carlo estimate of a call with strike *k* on the spread

$$\sum\_{n=1}^{N} \omega\_n S\_n(T) \qquad \left(= \sum\_{n=1}^{N} \omega\_n F\_n(T, T)\right).$$

with ω*n*, *n* = 1,..., *N* the weights of each component in the spread, as follows:

$$\frac{1}{M} \sum\_{m=1}^{M} \max \left\{ \left[ \sum\_{n=1}^{N} \omega\_n F\_n^{(m)} \right] - k, 0 \right\}. \tag{10}$$

For the sake of clarity we have set the risk-free rate curve equal to zero. We note that the random variables *ε*(*m*) can be simply re-used for pricing spread options with different maturity dates.
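Steps (i)–(iii) can be sketched in a few lines. In the implementation below, the integrated covariance matrix $\int_s^T \boldsymbol{\Xi}(T-u)\,du$ is taken as an input `C`; the function name and interface are choices of this sketch, not from the chapter. Antithetic variates are used as in footnote 1, and the risk-free rate is set to zero as in the text.

```python
import numpy as np

def spread_call_mc(F_sT, C, weights, strike, M=100_000, seed=0):
    """Monte-Carlo price of a call on sum_n w_n * F_n(T, T).

    F_sT    : today's futures prices F(s, T) for the N commodities
    C       : integrated covariance  int_s^T Xi(T - u) du  of log futures prices
    weights : spread weights w_n
    strike  : strike k of the call (zero risk-free rate, as in the text)
    """
    rng = np.random.default_rng(seed)
    mean = -0.5 * np.diag(C)              # step (i): drift so that E[F^(m)] = F(s, T)
    L = np.linalg.cholesky(C)
    Z = rng.standard_normal((M // 2, len(F_sT)))
    Z = np.vstack([Z, -Z])                # antithetic variates (cf. footnote 1)
    eps = mean + Z @ L.T
    F_T = F_sT * np.exp(eps)              # step (ii): simulated futures prices
    payoffs = np.maximum(F_T @ weights - strike, 0.0)
    return payoffs.mean()                 # step (iii): Monte-Carlo estimate

# Illustrative two-commodity example (made-up numbers, not the chapter's calibration):
# a higher correlation, such as the one induced by cointegration, lowers the price
F_sT = np.array([20.0, 20.0])
w = np.array([1.0, -1.0])
for rho in (0.2, 0.8):
    C = 0.09 * np.array([[1.0, rho], [rho, 1.0]])
    print(rho, spread_call_mc(F_sT, C, w, strike=0.0))
```

Because the mean correction equals minus half the diagonal of `C`, each simulated futures price has expectation **F**(*s*, *T*), and the same draws can be re-used for spreads with different weights or strikes.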

In the following we consider a system of three commodities<sup>2</sup> characterized by one cointegration relation with

$$\boldsymbol{\Theta} = \begin{bmatrix} 1 & -0.4 & -0.6 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$

The rest of the parameters describing the dynamics are

$$K_x = \begin{bmatrix} 1.5 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0.5 \end{bmatrix}, \quad \boldsymbol{\Sigma}_x = \begin{bmatrix} 0.0625 & 0.0562 & 0.0437 \\ 0.0562 & 0.0900 & 0.0262 \\ 0.0437 & 0.0262 & 0.1225 \end{bmatrix}, \quad \boldsymbol{\mu}_y = \begin{bmatrix} 0.025 \\ 0.025 \\ 0.025 \end{bmatrix},$$

$$K_y = \begin{bmatrix} 1.5 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad \boldsymbol{\Sigma}_y = \begin{bmatrix} 0.0225 & 0 & 0 \\ 0 & 0.0225 & 0 \\ 0 & 0 & 0.0225 \end{bmatrix}, \quad \boldsymbol{\Sigma}_{xy} = \mathbf{O}_3.$$

Since *Kx* is diagonal, each spot price is error-corrected only with respect to deviations from its own long-run level. Moreover, given the specific form of the *Ky* matrix, deviations from the cointegration relationship between the long-run levels influence only the dynamics of the first spot price. In this respect, the second and third commodities are "exogenous" in that their dynamics is not influenced by the variables characterizing the other commodities. Regarding instantaneous dependence, the shocks driving the dynamics of the long-run factors are not correlated, whereas we imposed positive correlations between the shocks driving the dynamics of **X**(*t*). More specifically, the instantaneous variance–covariance matrix Σ*y* for long-run shocks corresponds to an annual volatility of 0.15 for all three commodities. At the same time, the instantaneous variance–covariance matrix Σ*x* for short-run shocks corresponds to an annual volatility of 0.25 for the first commodity, of 0.30 for the second, and of 0.35 for the third, and to a correlation coefficient of 0.75 between the first and the second commodities, of 0.50 between the first and the third, and of 0.25 between the second and the third. For simplicity, we also assume there is no correlation between the two categories of shocks. Since we focus on the impact of cointegration on spread options, in the following simulations we have set, for illustration purposes, the vectors of risk premiums *λx* and *λy* and the risk-free rate curve equal to zero.<sup>3</sup>

<sup>1</sup>Here the technique of antithetic variables is used to reduce the number of random samples needed for a given level of accuracy.

<sup>2</sup>The structure of the parameters is chosen, in a parsimonious manner, taking into consideration the key facts of the empirical study conducted in Farkas et al. [6], where the results provide compelling evidence of cointegration between various commodities.

Figure 3 depicts the term structure of correlation, over a period of 5 years, between the returns of futures prices of the three commodities in the system in two cases: the one when the cointegration relation is taken into account and, respectively, the one where the cointegration relation is abstracted from (i.e. Θ = **O**3).

One can observe that, regarding the correlation term structure between commodities 2 and 3, the two curves are identical (Fig. 3, bottom panel). This is not surprising since these two commodities are "exogenous", as explained above, and their dynamics is not influenced by the cointegration relation. However, cointegration induces additional correlation between commodities 1 and 2 and between commodities 1 and 3, as also pointed out at the end of the previous section. In the absence of cointegration, the correlation vanishes after 2–3 years, whereas when the cointegration relation is taken into account the correlation persists also in the long run.

Next, we consider three spreads on two commodities, respectively *S*1(*t*) − *S*2(*t*), *S*1(*t*) − *S*3(*t*) and *S*2(*t*) − *S*3(*t*), and one spread on all three commodities in the system, *S*1(*t*) − 0.5(*S*2(*t*) + *S*3(*t*)). We assume that at time 0 the two factors are such that **X**(0) = **Y**(0) = (2, 2, 2)<sup>⊤</sup> and, therefore, the current spot prices of all four spreads equal 0. We focus on studying the prices of the at-the-money (ATM) European-style call spread options with up to 5 years to maturity. Figure 4 shows the term structure of prices in the case with cointegration relative to the prices in the case where cointegration is not accounted for.<sup>4</sup>

<sup>3</sup>In a real-world application the parameters of the model can be estimated using futures prices data for the corresponding commodities. Given the features of the model, one can implement an estimation procedure based on the Kalman filter.

**Fig. 3** Term structure of correlation, over a period of 5 years, between the futures log-returns of three commodities (from *top* to *bottom*: between 1 and 2, between 1 and 3, between 2 and 3)

**Fig. 4** Relative ATM call spread option prices with up to 5 years to maturity, and relative standard deviations of the spread distribution at maturities up to 5 years. *Top panel* for the spread *S*1(*t*) − *S*2(*t*). *Bottom panel* for the spread *S*1(*t*) − 0.5(*S*2(*t*) + *S*3(*t*))

**Fig. 5** The distribution of the spread at maturity (5 years). *Top panel* for the spread *S*1(*t*) − *S*2(*t*). *Bottom panel* for the spread *S*1(*t*) − 0.5(*S*2(*t*) + *S*3(*t*))

Cointegration has a significant impact on spread option prices: the price of the call with 5 years to maturity on the *S*1(*t*) − *S*2(*t*) spread is almost 30% lower in the case with cointegration, and the price of the call on the *S*1(*t*) − 0.5(*S*2(*t*) + *S*3(*t*)) spread is almost 60% lower. This can be explained by the fact that cointegration induces additional correlation that acts to lower the standard deviation of the distribution of the spread at maturity. To give a better grasp of this fact, Fig. 5 depicts the distribution of the spread at maturity in the two cases. We omitted the other two spreads from the figures, because the results for the *S*1(*t*) − *S*3(*t*) spread are similar to those for the *S*1(*t*) − *S*2(*t*) spread, and for the *S*2(*t*) − *S*3(*t*) spread there is, as expected given the "exogenous" nature of these two prices, no difference between the cases with and without cointegration.

If one were to add another cointegration relation to the system, linking the second and the third commodities in a long-run relationship, then the new cointegration relation would affect the prices of the options written on the *S*2(*t*) − *S*3(*t*) spread. Moreover, the new cointegration relation might also affect the option prices written on the other three spreads, the magnitude of this influence depending on the structure of the *Ky* matrix that captures the strength of responses in various spot prices to deviations in the new long-run relationship.

To have a better grasp of the influence of cointegration, next we run a series of sensitivity analyses concerning the existence of a second cointegration relationship in the system. To account for the new cointegration relation, we assume a new structure

$$\boldsymbol{\Theta} = \begin{bmatrix} 1 & -0.4 & -0.6 \\ -\theta & 1 & -0.8 \\ 0 & 0 & 0 \end{bmatrix} \quad \text{and} \quad K_y = \begin{bmatrix} 1.5 & k_1 & 0 \\ 0 & k_2 & 0 \\ 0 & -k_3 & 0 \end{bmatrix}, \qquad \theta, k_1, k_2, k_3 > 0.$$

The other parameters have the same values as before. We first focus on the impact of the parameters *k*<sup>2</sup> and *k*<sup>3</sup> on the *S*2(*t*) − *S*3(*t*) spread. These two parameters quantify the strength with which the second and, respectively, the third commodity react to deviations in the newly added cointegration relation. In the extreme case when both *k*<sup>2</sup> and *k*<sup>3</sup> are zero, we are in the same situation as before, since the two commodities do not react to deviations. However, as these parameters increase, the new cointegration relation starts to matter for the dynamics of the two commodities and has an impact on the distribution of the spread at maturity. Figure 6 presents the results of the sensitivity analysis when *k*<sup>2</sup> and *k*<sup>3</sup> are varied between 0 and 0.5, with the other parameters kept fixed at θ = 0.2 and *k*<sup>1</sup> = 0.

<sup>4</sup>Relative quantities in Fig. 4 are determined as the ratio between the quantity computed with the model accounting for cointegration and the corresponding quantity computed with the model without cointegration.

**Fig. 6** The impact of *k*<sup>2</sup> and *k*<sup>3</sup> on the distribution of the spread *S*2(*t*) − *S*3(*t*) at maturity (5 years). *Top left panel* correlation between the futures log-returns of the two commodities in the basket. *Top right panel* relative standard deviations of the spread distribution (the values are normalized by division with the standard deviation in the case *k*<sup>2</sup> = *k*<sup>3</sup> = 0). *Bottom panel* the distribution for the two extreme cases in the analysis

A higher value for the two reaction parameters produces a higher extra correlation induced by the second cointegration relation, which, in turn, is reflected in a lower standard deviation of the distribution of the spread at maturity. Over a 5-year horizon, the standard deviation for the case *k*<sup>2</sup> = *k*<sup>3</sup> = 0.5 is 32% lower than in the case where the two parameters equal zero, and the ATM call price is 35% lower.

**Fig. 7** The impact of *k*<sup>1</sup> and θ on the distribution of the spread *S*1(*t*) − 0.5(*S*2(*t*) + *S*3(*t*)) at maturity (5 years). *Top left panel* correlation between the futures log-returns of the first commodity and the sum of the other two. *Top right panel* relative standard deviations of the spread distribution (the values are normalized by division with the standard deviation in the case *k*<sup>1</sup> = 0, θ = 0.2). *Bottom panel* the distribution for two specific cases in the analysis

Next, we focus on the impact of θ and *k*<sup>1</sup> on the *S*1(*t*) − 0.5(*S*2(*t*) + *S*3(*t*)) spread. The parameter θ is a free variable that determines the second cointegration relationship, and the parameter *k*<sup>1</sup> measures the magnitude of the response of the first commodity to deviations from the second cointegration relation. Figure 7 presents the results of the sensitivity analysis when *k*<sup>1</sup> and θ are varied between 0 and 1 and, respectively, between 0.1 and 0.3, with the other parameters kept fixed at *k*<sup>2</sup> = 0.25 and *k*<sup>3</sup> = 0.25. An increase of *k*<sup>1</sup> generates a reduction in the correlation between the components of the spread, showing that the second cointegration relationship has the effect of pulling the components of the spread away from each other. This effect is marginally stronger for smaller θ. The result of the reduction in correlation is a higher standard deviation of the distribution of the spread at maturity.

For a maturity of 5 years, the standard deviation for the case *k*<sup>1</sup> = 1 is around 33% higher than in the case where the parameter equals zero, and the ATM call price is about 40% higher. Therefore, the two cointegration relations influence the distribution of the *S*1(*t*) − 0.5(*S*2(*t*) + *S*3(*t*)) spread in different directions: the first one generates a reduction, and the second one an increase, in the standard deviation. The overall impact depends on the magnitude of the parameters quantifying the responses of the commodities to deviations in the two cointegration relations.

# **4 Concluding Remarks**

In this work, we explored the implications of cointegration between commodity prices for the premiums of options written on various spreads between these commodities. We employed the continuous-time model of cointegrated commodity prices developed in Farkas et al. [6] and conducted a simulation study for a cointegrated system of three commodities. We calculated the prices of several spread options and found that cointegration significantly influences these prices. Furthermore, we pointed out that cointegration leads to an upward-sloping correlation term structure, which lowers the volatility of spreads and therefore also the value of options on them. Although we restricted ourselves to a simulation study in this chapter, it is worthwhile to mention that the model can also be estimated using futures prices on various commodities, as shown in Farkas et al. [6].

**Acknowledgements** The KPMG Center of Excellence in Risk Management is acknowledged for organizing the conference "Challenges in Derivatives Markets - Fixed Income Modeling, Valuation Adjustments, Risk Management, and Regulation".

**Open Access** This chapter is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, a link is provided to the Creative Commons license and any changes made are indicated.

The images or other third party material in this chapter are included in the work's Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work's Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.

# **References**


# **The Dynamic Correlation Model and Its Application to the Heston Model**

### **L. Teng, M. Ehrhardt and M. Günther**

**Abstract** Correlation plays an essential role in many problems of finance and economics, such as pricing financial products and hedging strategies, since it models the degree of relationship between, e.g., financial products and financial institutions. For simplicity, however, the correlation coefficient is assumed to be constant in many models, although financial quantities are correlated in a strongly nonlinear way in the real market. This work provides a new time-dependent correlation function, which can easily be used to construct dynamically (time-dependent) correlated Brownian motions and can be flexibly incorporated into many financial models. The aim of using our time-dependent correlation function is to choose the additional parameters so as to increase the fitting quality on the one hand and to add an economic concept on the other hand. As an example, we illustrate the application of dynamic correlation in the Heston model. From our numerical results we conclude that the Heston model extended by incorporating time-dependent correlations can provide a better volatility smile than the pure Heston model.

**Keywords** Time-dependent correlations · Heston model · Implied volatility · Nonlinear dependence

# **1 Introduction**

Correlation is a well-established concept for quantifying interdependence. It plays an essential role in several problems of finance and economics, such as pricing financial

L. Teng (B) · M. Ehrhardt · M. Günther
Lehrstuhl für Angewandte Mathematik und Numerische Analysis, Fakultät Mathematik und Naturwissenschaften, Bergische Universität Wuppertal, Gaußstr. 20, 42119 Wuppertal, Germany
e-mail: teng@math.uni-wuppertal.de

M. Ehrhardt
e-mail: ehrhardt@math.uni-wuppertal.de

M. Günther
e-mail: guenther@math.uni-wuppertal.de

products and hedging strategies. For example, the arbitrage pricing model in [3] relies on correlation as a measure of the dependence among assets, and in portfolio credit models the default correlation is one fundamental factor of risk evaluation, see [1, 2, 12].

In most financial models, the correlation is taken to be constant. However, this is not a realistic assumption, due to the well-known fact that the correlation is hardly a fixed constant, see e.g. [7, 13]. For example, in many situations the pure Heston model [9] cannot provide as pronounced skews or smiles in the implied volatility surface as the market requires, especially for short maturities. A reason for this might be that deterministically correlated Brownian motions (BMs) are used for the price process and the variance process, as the correlation mainly affects the slope of the implied volatility smile. If the correlation is modeled as a time-dependent dynamic function, better skews or smiles can be produced in the implied volatility surface by reasonably choosing the additional parameters. Furthermore, compared with extending a model using time-dependent parameters, e.g., [6, 10] for the Heston model, a time-dependent correlation function adds an economic concept (a nonlinear relationship) and its application is considerably simpler.

The key to modeling correlation as a time-dependent function is ensuring that the boundaries −1 and 1 of the correlation function are neither attractive nor attainable at any time. In this work, we construct an appropriate time-dependent correlation function, so that one can reasonably choose additional parameters to increase the fitting quality on the one hand and add an economic concept on the other hand.

The outline of the remaining part is as follows. Section 2 is devoted to a specific dynamic correlation function and its (analytical) computation. In Sect. 3, we present the concept of dynamically (time-dependent) correlated Brownian motions and the corresponding construction. The incorporation of our new dynamic correlation model in the Heston model is illustrated in Sect. 4. Finally, in Sect. 5 we conclude.

# **2 The Dynamic Correlation Function**

In this section we introduce a dynamic correlation function. Such a function must satisfy two correlation properties: it takes values only in the interval (−1, 1) for any time, and it converges as time increases. We propose the following simple construction: we denote the dynamic correlation by $\bar{\rho}$ and simply use

$$\bar{\rho}\_t := E\left[\tanh(X\_t)\right], \quad t > 0 \tag{1}$$

for the *dynamic correlation function*, where $X\_t$ is any mean-reverting process taking positive and negative values. For known parameters of $X\_t$, the correlation function $\bar{\rho}\_t \colon [0, t] \to (-1, 1)$ depends only on $t$. We observe that the dynamic correlation model (1) satisfies the desired properties: first, it is obvious that $\bar{\rho}\_t$ takes values only in $(-1, 1)$ for all $t$; second, it converges for increasing time due to the mean reversion of the process $X\_t$.

Although the function tanh is intuitively well suited for mapping values into the interval (−1, 1), one might still ask whether other functions could serve the same purpose, such as trigonometric functions or $\frac{2}{\pi}\arctan(\frac{\pi}{2}x)$. In theory, such functions could be used. The practical question, however, is whether the expectation of the mean-reverting process transformed by such a function can be obtained in a closed-form expression. Furthermore, our experiments show that the shape of the function tanh is more suitable for modeling correlations, see [13].

The process $X\_t$ in (1) can be any mean-reverting process that allows positive and negative outcomes. As an example, let $X\_t$ be the *Ornstein–Uhlenbeck process* [14]

$$dX\_t = \kappa (\mu - X\_t)dt + \sigma dW\_t, \quad t \ge 0. \tag{2}$$
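Definition (1) can be checked directly by Monte Carlo before any closed form is derived: the Ornstein–Uhlenbeck transition law is Gaussian with known mean and variance, so $X\_t$ can be sampled in one shot and $\tanh(X\_t)$ averaged. A minimal sketch (the function name and defaults are ours, not from the chapter):

```python
import numpy as np

def mc_dynamic_corr(t, x0, kappa, mu, sigma, n=1_000_000, seed=0):
    """Monte Carlo estimate of rho_bar_t = E[tanh(X_t)] for the OU process (2),
    using its exact Gaussian transition law started at X_0 = x0."""
    rng = np.random.default_rng(seed)
    mean = np.exp(-kappa * t) * x0 + mu * (1.0 - np.exp(-kappa * t))
    var = sigma**2 / (2.0 * kappa) * (1.0 - np.exp(-2.0 * kappa * t))
    x_t = mean + np.sqrt(var) * rng.standard_normal(n)
    return float(np.tanh(x_t).mean())
```

By construction the estimate lies in $(-1, 1)$, and for large $t$ it settles near the stationary value, illustrating the two required properties of a dynamic correlation function.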

We are interested in computing $\bar{\rho}\_t$ as a function of the given parameters in (2). We compute $\bar{\rho}\_t = E[\tanh(X\_t)]$ as

$$\bar{\rho}\_t = E[\tanh(X\_t)] = E\left[1 - \mathbf{e}^{-X\_t} \cdot \frac{2}{\mathbf{e}^{-X\_t} + \mathbf{e}^{X\_t}}\right] = 1 - E\left[\mathbf{e}^{-X\_t} \cdot \frac{1}{\cosh(X\_t)}\right]. \tag{3}$$

We set $g(X\_t) = 1/\cosh(X\_t)$. Applying the results of Chen and Joslin [4], the expectation in (3) can be found in closed form (up to an integral) as

$$\frac{1}{2\pi} \int\_{-\infty}^{\infty} \hat{g}(u) \cdot E[\mathbf{e}^{-X\_t} \mathbf{e}^{iuX\_t}] \, du,\tag{4}$$

where $i = \sqrt{-1}$ denotes the imaginary unit and $\hat{g}$ is the Fourier transform of $g$, which in this case is known analytically: $\hat{g}(u) = \pi/\cosh(\frac{\pi u}{2})$. Denoting by $CF(t, u|X\_0, \kappa, \mu, \sigma)$ the characteristic function of $X\_t$, the expectation in (4) can be written as $CF(t, i + u|X\_0, \kappa, \mu, \sigma)$. Thus, we obtain the closed-form expression for $\bar{\rho}\_t$:

$$\bar{\rho}\_t = 1 - \frac{1}{2} \int\_{-\infty}^{\infty} \frac{1}{\cosh(\frac{\pi u}{2})} \cdot CF(t, i + u | X\_0, \kappa, \mu, \sigma)\, du. \tag{5}$$

The next step is to calculate $CF(t, i + u|X\_0, \kappa, \mu, \sigma)$. The process $X\_t$ is an Ornstein–Uhlenbeck process and its characteristic function $CF(t, u|X\_0, \kappa, \mu, \sigma)$ can be obtained analytically, e.g., in the framework of affine processes, see [5]. Then, we only need to substitute $u + i$ for $u$ in the characteristic function of $X\_t$ to obtain $CF(t, i + u|X\_0, \kappa, \mu, \sigma)$, which is given by

$$CF(t, i + u | X\_0, \kappa, \mu, \sigma) = \mathbf{e}^{-A(t) - \frac{B(t)}{2} + iu(A(t) + B(t)) + u^2 \frac{B(t)}{2}},\tag{6}$$

with

$$A(t) = \mathbf{e}^{-\kappa t} X\_0 + \mu(1 - \mathbf{e}^{-\kappa t}), \quad B(t) = -\frac{\sigma^2}{2\kappa}(1 - \mathbf{e}^{-2\kappa t}).\tag{7}$$

Finally, the dynamic correlation function ρ¯*<sup>t</sup>* can be computed by

$$\bar{\rho}\_t = 1 - \frac{\mathbf{e}^{-A(t) - \frac{B(t)}{2}}}{2} \int\_{-\infty}^{\infty} \frac{1}{\cosh(\frac{\pi u}{2})} \cdot \mathbf{e}^{iu(A(t) + B(t)) + u^2 \frac{B(t)}{2}} du,\tag{8}$$

where $A(t)$ and $B(t)$ are defined in (7). In fact, $X\_0$ in $A(t)$ equals $\operatorname{artanh}(\bar{\rho}\_0)$.
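Since $B(t) \le 0$, the integrand in (8) decays like a Gaussian multiplied by a sech factor, and by symmetry only its cosine part contributes. The integral can therefore be evaluated on a truncated uniform grid; the following sketch (function name, truncation bound, and step size are our choices) assumes the parametrization of (7) with $X\_0 = \operatorname{artanh}(\bar{\rho}\_0)$:

```python
import numpy as np

def dynamic_corr(t, rho0, kappa, mu, sigma, u_max=40.0, n_nodes=8001):
    """Evaluate the dynamic correlation rho_bar_t of Eq. (8), with A(t), B(t)
    from (7) and X_0 = artanh(rho0). The imaginary part of the integrand is
    odd and integrates to zero, so only the cosine term is kept."""
    a = np.exp(-kappa * t) * np.arctanh(rho0) + mu * (1.0 - np.exp(-kappa * t))
    b = -sigma**2 / (2.0 * kappa) * (1.0 - np.exp(-2.0 * kappa * t))  # B(t) <= 0
    u = np.linspace(-u_max, u_max, n_nodes)
    du = u[1] - u[0]
    integrand = np.cos(u * (a + b)) * np.exp(0.5 * b * u**2) / np.cosh(0.5 * np.pi * u)
    # composite trapezoidal rule (endpoints are numerically negligible)
    integral = du * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return float(1.0 - 0.5 * np.exp(-a - 0.5 * b) * integral)
```

As a sanity check, for $\sigma \to 0$ the integral reduces to $2/\cosh(A(t))$ and the formula returns $\tanh(A(t))$, the deterministic transformed mean.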

To illustrate the role of each parameter in (8), we plot $\bar{\rho}\_t$ for several parameter values. First, in Fig. 1, we let κ = 2 and σ = 0.5 and display $\bar{\rho}\_t$ for μ equal to 0.5, 0, and −0.5, respectively. Obviously, μ determines the long-term mean of $\bar{\rho}\_t$; however, μ is not the exact limiting value. Considering Fig. 1a, where the initial value of the correlation function is 0, we see that $\bar{\rho}\_t$ increases to a value around μ = 0.5 and decreases to a value around μ = −0.5 as $t$ becomes larger, for μ = 0.5 and −0.5, respectively. Besides, for $\mu = \bar{\rho}\_0 = 0$ we observe that the correlation function $\bar{\rho}\_t$ is always 0, which coincides with the constant correlation ρ = 0. Now we set $\bar{\rho}\_0 = 0.3$, keep all other parameters unchanged, and display the curves of $\bar{\rho}\_t$ in Fig. 1b.

Next, we fix κ = 2 and μ = 0.5 and display $\bar{\rho}\_t$ for σ = 0.5, 1, and 2 in Fig. 2. Obviously, σ determines the magnitude of the variation around the transformed mean value of $X\_t$ (μ = 0.5). In Fig. 2a we see that the larger the value of σ, the stronger the deviations of $\bar{\rho}\_t$ from the transformed mean value of $X\_t$. More interestingly, $\bar{\rho}\_t$ first decreases until $t \approx 0.25$, then increases and converges to a value, see Fig. 2b, where $\bar{\rho}\_0 = 0.3$ and σ = 2.

Again, in order to illustrate the role of κ, we set μ = 0.5, σ = 2 and vary the value of κ, see Fig. 3. From Fig. 3a it is easy to observe that κ represents the speed at which $\bar{\rho}\_t$ tends to its limit. In particular, as we have seen in Fig. 2b, the curve is more

**Fig. 1** Dynamic correlation $\bar{\rho}\_t$ for varying μ (κ = 2 and σ = 0.5). **a** $\bar{\rho}\_0 = 0$. **b** $\bar{\rho}\_0 = 0.3$

**Fig. 2** Dynamic correlation $\bar{\rho}\_t$ for varying σ (κ = 2 and μ = 0.5). **a** $\bar{\rho}\_0 = 0$. **b** $\bar{\rho}\_0 = 0.3$

**Fig. 3** Dynamic correlation $\bar{\rho}\_t$ for varying κ (μ = 0.5 and σ = 2). **a** $\bar{\rho}\_0 = 0$. **b** $\bar{\rho}\_0 = 0.3$

unstable for κ = 2 and σ = 2 in Fig. 3b. However, if σ remains constant while the value of κ is increased, the curves of $\bar{\rho}\_t$ become more stable and tend directly to their limit. If one incorporates the dynamic correlation function (8) into a financial model, the parameters $\bar{\rho}\_0$, κ, μ, and σ can be estimated by fitting the model to market data.

# **3 Dynamically Correlated BMs and Their Construction**

We fix a probability space $(\Omega, \mathcal{F}, P)$ and an information filtration $(\mathcal{F}\_t)\_{t \in \mathbb{R}\_+}$ satisfying the usual conditions, see e.g. [11]. At a time $t > 0$, the correlation coefficient of two Brownian motions (BMs) $W\_t^1$ and $W\_t^2$ is defined as


$$
\rho\_t^{1,2} = \frac{E\left[W\_t^1 W\_t^2\right]}{t}.\tag{9}
$$

If $\rho\_t^{1,2}$ is constant, i.e., $\rho\_t^{1,2} = \rho\_{1,2}$ for all $t > 0$, we say that $W\_t^1$ and $W\_t^2$ are correlated with the constant correlation $\rho\_{1,2}$.

This motivates the following definition of dynamically correlated BMs.

**Definition 1** Two Brownian motions $W\_t^1$ and $W\_t^2$ are called *dynamically correlated* with correlation function $\rho\_t$, if they satisfy

$$E\left[W\_t^1W\_t^2\right] = \int\_0^t \rho\_s ds,\tag{10}$$

where $\rho\_t \colon [0, t] \to [-1, 1]$. The *average correlation* of $W\_t^1$ and $W\_t^2$ is given by $\rho\_{Av} := \frac{1}{t}\int\_0^t \rho\_s\, ds$.

We first consider the two-dimensional case and let $\rho\_t$ be a correlation function. For two independent BMs $W\_t^1$ and $W\_t^3$ we define

$$\boldsymbol{W}\_t^2 = \int\_0^t \rho\_s d\boldsymbol{W}\_s^1 + \int\_0^t \sqrt{1 - \rho\_s^2} \, d\boldsymbol{W}\_s^3,\tag{11}$$

with the symbolic expression

$$d\boldsymbol{W}\_t^2 = \rho\_t d\boldsymbol{W}\_t^1 + \sqrt{1 - \rho\_t^2} \, d\boldsymbol{W}\_t^3. \tag{12}$$

It can easily be verified that $W\_t^2$ is a BM and is dynamically correlated with $W\_t^1$ by $\rho\_t$. Besides, the covariance matrix and the average correlation matrix of $\mathbf{W}\_t = (W\_t^1, W\_t^2)$ can be determined, given by

$$\begin{pmatrix} t & \int\_0^t \rho\_s ds\\ \int\_0^t \rho\_s ds & t \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 & \frac{1}{t} \int\_0^t \rho\_s ds\\ \frac{1}{t} \int\_0^t \rho\_s ds & 1 \end{pmatrix},$$

respectively.
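The two-dimensional construction (11)-(12) can be verified by simulation: discretizing both stochastic integrals with an Euler scheme and comparing the empirical $E[W\_T^1 W\_T^2]$ with the discretized integral of $\rho\_s$. A sketch with a hypothetical linear correlation function $\rho\_s = 0.8\, s$ (all numerical choices are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_steps, n_paths = 1.0, 100, 20_000
dt = T / n_steps
t_grid = np.linspace(0.0, T, n_steps + 1)[:-1]     # left endpoints of each step
rho = 0.8 * t_grid                                 # hypothetical rho_s = 0.8 s

dW1 = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
dW3 = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
dW2 = rho * dW1 + np.sqrt(1.0 - rho**2) * dW3      # Eq. (12), step by step

W1_T, W2_T = dW1.sum(axis=1), dW2.sum(axis=1)
empirical = float(np.mean(W1_T * W2_T))            # estimates E[W_T^1 W_T^2]
target = float(np.sum(rho) * dt)                   # discretized integral of rho_s
```

Dividing `target` by $T$ gives the average correlation of Definition 1; the empirical covariance matches `target` up to Monte Carlo error.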

The construction above can also be generalized to $n$ dimensions. We denote a standard $n$-dimensional BM by $\mathbf{Z}\_t = (Z\_{1,t}, \ldots, Z\_{n,t})$ and the matrix of dynamic correlations by $R\_t = (\rho\_t^{i,j})\_{1 \leq i,j \leq n}$, which for each time $t$ admits the Cholesky decomposition $R\_t = A\_t A\_t^\top$ with $A\_t = (a\_t^{i,j})\_{1 \leq i,j \leq n}$. We define a new $n$-dimensional process $\mathbf{W}\_t = (W\_{1,t}, \ldots, W\_{n,t})$ by

$$dW\_{i,t} = \sum\_{j=1}^{n} a\_t^{i,j} \, dZ\_{j,t}, \quad i = 1, \ldots, n. \tag{13}$$

We can easily verify that each component of $\mathbf{W}\_t$ is a BM and that the increments $\mathbf{W}\_t - \mathbf{W}\_s$ have the covariance matrix

$$
\Sigma = \begin{pmatrix}
t - s & \int\_s^t \rho\_u^{1,2}\, du & \dots & \int\_s^t \rho\_u^{1,n}\, du \\
\int\_s^t \rho\_u^{2,1}\, du & t - s & \dots & \int\_s^t \rho\_u^{2,n}\, du \\
\vdots & \vdots & \ddots & \vdots \\
\int\_s^t \rho\_u^{n,1}\, du & \int\_s^t \rho\_u^{n,2}\, du & \dots & t - s
\end{pmatrix}.
$$

We call the process $(\mathbf{W}\_t)\_{t \geq 0}$ *an n-dimensional dynamically correlated Brownian motion* with the correlation matrix $R\_t$.
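The $n$-dimensional construction (13) amounts to mixing independent increments of $\mathbf{Z}\_t$ through the Cholesky factor $A\_t$ of $R\_t$ at each time step. A simulation sketch (function name and discretization are ours):

```python
import numpy as np

def correlated_increments(R_fn, t_grid, n_paths, seed=0):
    """Increments of an n-dim dynamically correlated BM via Eq. (13):
    at each step, dW = A_t dZ with R_t = A_t A_t^T (Cholesky)."""
    rng = np.random.default_rng(seed)
    n = R_fn(t_grid[0]).shape[0]
    dW = np.empty((n_paths, len(t_grid) - 1, n))
    for k in range(len(t_grid) - 1):
        dt = t_grid[k + 1] - t_grid[k]
        A = np.linalg.cholesky(R_fn(t_grid[k]))
        Z = rng.standard_normal((n_paths, n)) * np.sqrt(dt)
        dW[:, k, :] = Z @ A.T
    return dW
```

Summing the increments along the time axis yields $\mathbf{W}\_t$, whose empirical covariance reproduces $\Sigma$ up to Monte Carlo and discretization error.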

# **4 Dynamic Correlation in the Heston Model**

As mentioned before, in many situations the pure Heston model cannot properly reproduce a volatility smile. To address this problem, several time-dependent Heston models have been proposed to improve the fit to implied volatilities, e.g., [6] and [10]. In this section, we show how to incorporate our time-dependent correlation function into the Heston model.

# *4.1 Incorporating Dynamic Correlations*

Heston's stochastic volatility model is specified as

$$d\mathbf{S}\_t = \mu\_S \mathbf{S}\_t dt + \sqrt{\nu\_t} \mathbf{S}\_t dW\_t^S,\tag{14}$$

$$d\upsilon\_t = \kappa\_\upsilon (\mu\_\upsilon - \upsilon\_t)dt + \sigma\_\upsilon \sqrt{\upsilon\_t} \, dW\_t^\upsilon,\tag{15}$$

where (14) describes the price of the spot asset, (15) the volatility (variance), and $W\_t^S$ and $W\_t^\nu$ are correlated with a constant correlation $\rho\_{S\nu}$. To incorporate time-dependent correlations, we assume that $dS\_t$ and $d\nu\_t$ are correlated through a time-dependent correlation function $\bar{\rho}\_t$ instead of the constant correlation $\rho\_{S\nu}$. The extended Heston model with dynamic correlation $\bar{\rho}\_t$ is specified as

$$dS\_t = \mu\_S S\_t dt + \sqrt{\nu\_t} S\_t \, dW\_t^1,\tag{16}$$

$$d\upsilon\_t = \kappa\_\upsilon (\mu\_\upsilon - \upsilon\_t) dt + \sigma\_\upsilon \sqrt{\upsilon\_t} \left( \bar{\rho}\_t dW\_t^1 + \sqrt{1 - \bar{\rho}\_t^2} \, dW\_t^2 \right), \tag{17}$$

where $W\_t^1$ and $W\_t^2$ are independent. Applying Itô's lemma and no-arbitrage arguments yields [9]

$$\begin{split} \frac{1}{2}\nu S^{2}\frac{\partial^{2}U}{\partial S^{2}} + \bar{\rho}\_{t}\sigma\_{\nu}\nu S\frac{\partial^{2}U}{\partial S\partial\nu} + \frac{1}{2}\sigma\_{\nu}^{2}\nu\frac{\partial^{2}U}{\partial\nu^{2}} + rS\frac{\partial U}{\partial S} \\ + [\kappa\_{\nu}(\mu\_{\nu}-\nu)-\tilde{\lambda}(S,\nu,\bar{\rho}\_t,t)\nu]\frac{\partial U}{\partial\nu} - rU + \frac{\partial U}{\partial t} = 0, \end{split} \tag{18}$$

where $\bar{\rho}\_t$ is defined as in (8), but with the parameters $\bar{\rho}\_0$, $\kappa\_\rho$, $\mu\_\rho$, and $\sigma\_\rho$. It is worth mentioning that the market price of volatility risk also depends on the dynamic correlation and can be written as $\tilde{\lambda}(S, \nu, \bar{\rho}\_t, t)$. This means that the price of correlation risk embedded in the price of volatility risk has been taken into account.
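For intuition, the extended model (16)-(17) can also be simulated directly with an Euler scheme, mixing the two Brownian increments through $\bar{\rho}\_t$ exactly as in the construction of Sect. 3. The following sketch (a log-Euler step for $S$ and full truncation for $\nu$ are our discretization choices, not from the chapter) takes an arbitrary correlation function as input:

```python
import numpy as np

def simulate_ext_heston(S0, v0, mu_S, kappa_v, mu_v, sigma_v, rho_bar,
                        T=1.0, n_steps=252, n_paths=20_000, seed=1):
    """Euler scheme for Eqs. (16)-(17); rho_bar(t) returns the dynamic
    correlation. Full truncation (v^+ = max(v, 0)) keeps the variance
    update well defined when v dips below zero."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    S = np.full(n_paths, float(S0))
    v = np.full(n_paths, float(v0))
    for k in range(n_steps):
        rho = rho_bar(k * dt)
        dW1 = rng.standard_normal(n_paths) * np.sqrt(dt)
        dW2 = rng.standard_normal(n_paths) * np.sqrt(dt)
        vp = np.maximum(v, 0.0)
        S *= np.exp((mu_S - 0.5 * vp) * dt + np.sqrt(vp) * dW1)      # Eq. (16), log-Euler
        v += kappa_v * (mu_v - vp) * dt + sigma_v * np.sqrt(vp) * (
            rho * dW1 + np.sqrt(1.0 - rho**2) * dW2)                  # Eq. (17)
    return S, v
```

With drift $\mu\_S = 0$ the scheme preserves the martingale property of $S$ step by step, which gives a quick correctness check.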

We consider, e.g. a European call option with strike price *K* and maturity *T* in the Heston model

$$C(S, \nu, t, \bar{\rho}\_t) = SP\_1 - KP(t, T)P\_2, \quad \tau = T - t,\tag{19}$$

where $P(t, T)$ is the discount factor. Both in-the-money probabilities $P\_1$, $P\_2$ must satisfy the PDE (18), as must their characteristic functions $f\_1(S\_t, \nu\_t, \bar{\rho}\_t, \phi, t)$ and $f\_2(S\_t, \nu\_t, \bar{\rho}\_t, \phi, t)$:

$$f\_j(S\_t, \nu\_t, \bar{\rho}\_t, \phi, t) = E[\mathbf{e}^{i\phi\ln S\_T} | S\_t, \nu\_t, \bar{\rho}\_t] = \mathbf{e}^{C\_j(\tau, \phi) + D\_j(\tau, \phi)\nu + i\phi\ln S\_t}, \quad j = 1, 2, \tag{20}$$

where $C\_j(0, \phi) = 0$ and $D\_j(0, \phi) = 0$. By substituting the functional form (20) into the PDE (18), we obtain the following ordinary differential equations (ODEs) for the unknown functions $C\_j$ and $D\_j$:

$$-\frac{1}{2}\phi^2 + \bar{\rho}\_t \sigma\_v \phi i D\_j + \frac{1}{2}\sigma\_v^2 D\_j^2 + u\_j \phi i - b\_j D\_j + \frac{\partial D\_j}{\partial t} = 0,\tag{21}$$

$$r\phi i + \kappa\_\upsilon \mu\_\upsilon D\_j + \frac{\partial C\_j}{\partial t} = 0,\qquad(22)$$

with the initial conditions $C\_j(0, \phi) = D\_j(0, \phi) = 0$ and

$$u\_1 = 0.5, \quad u\_2 = -0.5, \quad b\_1 = \kappa\_v + \lambda - \bar{\rho}\_t \sigma\_v \quad \text{and} \quad b\_2 = \kappa\_v + \lambda,\tag{23}$$

where

$$\bar{\rho}\_t = 1 - \frac{\mathbf{e}^{-A(t) - \frac{B(t)}{2}}}{2} \int\_{-\infty}^{\infty} \underbrace{\frac{1}{\cosh(\frac{\pi u}{2})} \cdot \mathbf{e}^{iu(A(t) + B(t)) + u^2 \frac{B(t)}{2}}}\_{:=\mathbf{g}(u)} du,\tag{24}$$

with $A(t) = \mathbf{e}^{-\kappa\_\rho t}\operatorname{artanh}(\bar{\rho}\_0) + \mu\_\rho(1 - \mathbf{e}^{-\kappa\_\rho t})$ and $B(t) = -\frac{\sigma\_\rho^2}{2\kappa\_\rho}(1 - \mathbf{e}^{-2\kappa\_\rho t})$.

Obviously, (21) and (22) cannot be solved analytically. Therefore, we need an efficient way to compute the option price numerically. We first generate the

**Fig. 4** $g(u)$ for $\bar{\rho}\_0 = 0.3$, $\kappa\_\rho = 2$, $\mu\_\rho = -0.8$, $\sigma\_\rho = 0.1$. **a** *t* = 0.1. **b** *t* = 10

dynamic correlations using (24). We observe that $g(u)$ is symmetric about $u = 0$ and vanishes (approaches zero) for sufficiently large $|u|$, see Fig. 4. For these two reasons, the numerical integration in (24) is computationally fast. Next, we use an explicit Runge–Kutta method, the MATLAB routine `ode45`, to obtain $C$ and $D$ in (21) and (22), and thus also the characteristic functions (20). Finally, we employ the COS method [8] to obtain the option price $C(S, \nu, t, \bar{\rho}\_t)$ in (19). Thanks to the COS method, although the ODE system is solved numerically, the time for obtaining European option prices is less than 0.1 s, so that a calibration can be performed. The total error consists of the error of `ode45` applied to (21) and (22) and the error of the COS method; a detailed error analysis of the COS method is provided in [8].

# *4.2 Calibration of the Heston Model Under Dynamic Correlation*

In this section we calibrate the Heston model extended by our time-dependent correlation function to real market data (Nikk300 index call options on July 16, 2012) and compare it to the pure Heston model [9] and the time-dependent Heston model by Mikhailov and Nögel [10].

We consider a set of $N$ maturities $T\_i$, $i = 1, \ldots, N$, and a set of $M$ strikes $K\_j$, $j = 1, \ldots, M$. For each combination of maturity and strike we have a market price $V^M(T\_i, K\_j) = V^M\_{ij}$ and a corresponding model price $V(T\_i, K\_j; \Theta) = V^\Theta\_{ij}$ generated using (19). As loss function we choose the relative mean error sum of squares (RMSE), $\frac{1}{M \times N}\sum\_{i,j} \frac{(V^M\_{ij} - V^\Theta\_{ij})^2}{V^M\_{ij}}$, which is minimized to obtain the parameter estimates

$$
\hat{\Theta} = \arg\min \frac{1}{M \times N} \sum\_{i,j} \frac{(V\_{\vec{ij}}^M - V\_{\vec{ij}}^{\Theta})^2}{V\_{\vec{ij}}^M}. \tag{25}
$$
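A minimal implementation of the loss in (25), with the grids of market and model prices passed as arrays (the function name is ours):

```python
import numpy as np

def rmse_loss(V_market, V_model):
    """Relative mean error sum of squares of Eq. (25),
    averaged over all (T_i, K_j) combinations."""
    V_market = np.asarray(V_market, dtype=float)
    V_model = np.asarray(V_model, dtype=float)
    return float(np.mean((V_market - V_model) ** 2 / V_market))
```

Minimizing this function over $\Theta$ with any standard optimizer, subject to the parameter restrictions described in the text, reproduces the calibration objective.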

For the optimization we restrict $\bar{\rho}\_0$ to the interval $(-1, 1)$ but do not restrict the value of $\mu\_\rho$: since it is the mean-reversion level of the Ornstein–Uhlenbeck process rather than the direct limit of the correlation function, it could in principle take any value in $\mathbb{R}$. Our experiments showed that it is sufficient and appropriate to restrict $\mu\_\rho$ to the interval $[-4, 4]$.

We state the estimated parameters and the estimation error for the pure Heston model (abbr. PH), the Heston model under our time-dependent correlations (CH), and the time-dependent Heston model by Mikhailov and Nögel [10] (MN) in Tables 1, 2 and 3, respectively. We see that the estimation error using the CH model is significantly smaller than the error using the PH model and almost the same as the error (sum of errors over the maturities) under the MN model. To illustrate this more clearly, for each maturity we compare the implied volatilities of all the models to the market volatilities in Fig. 5. We observe that the implied volatilities of the CH model are much closer to the market volatilities than those of the PH model; in particular, the CH model produces a better volatility smile for the short maturity *T* = 1/12. Compared to the MN model, the implied volatilities of our model are almost the same. However, our CH model has an economic interpretation, namely that the correlation is nonlinear

**Table 1** The estimated parameters for the pure Heston model using call options on the Nikk300 index on July 16, 2012 for the maturities 1/12, 1/4, 1/2, 1


**Table 2** The estimated parameters for the Heston model under time-dependent correlations using call options on the Nikk300 index on July 16, 2012 for the maturities 1/12, 1/4, 1/2, 1


The extended Heston model using our time-dependent correlation function

**Table 3** The estimated parameters for the time-dependent Heston model by Mikhailov and Nögel [10] using call options on the Nikk300 index on July 16, 2012 for the maturities 1/12, 1/4, 1/2, 1

**Fig. 5** The comparison of implied volatilities for all the models to the market volatilities of the call options on the Nikk300 index on July 16, 2012, where the spot price is 150.9

and time-dependent, as the market requires. We conclude that the Heston model extended by incorporating our time-dependent correlations can provide better volatility smiles than the pure Heston model, and that the time-dependent correlation function can be easily and directly introduced into financial models.

# **5 Conclusion**

In this work, we first investigated dynamically (time-dependent) correlated Brownian motions and their construction. Furthermore, we proposed a new dynamic correlation function which can easily be incorporated into other financial models. The aim of using our dynamic correlation function is to choose the additional parameters so as to increase the fitting quality on the one hand and to add an economically meaningful perspective on the other.

As an application, we incorporated our time-dependent correlation function into the Heston model. An experiment estimating the models from real market data was provided. The numerical calibration results show that the Heston model extended by our time-dependent correlation function provides better volatility smiles than the pure Heston model. Besides, this time-dependent correlation function can be easily and directly imposed on financial models and is thus preferable to a constant correlation.

**Acknowledgements** The authors acknowledge the much appreciated inspiration and in-depth discussions with Dr. Jörg Kienitz from Deloitte Düsseldorf, Germany.

The work was partially supported by the European Union in the FP7-PEOPLE-2012-ITN Program under Grant Agreement Number 304617 (FP7 Marie Curie Action, Project Multi-ITN STRIKE— Novel Methods in Computational Finance).

The KPMG Center of Excellence in Risk Management is acknowledged for organizing the conference "Challenges in Derivatives Markets—Fixed Income Modeling, Valuation Adjustments, Risk Management, and Regulation".

**Open Access** This chapter is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, a link is provided to the Creative Commons license and any changes made are indicated.

The images or other third party material in this chapter are included in the work's Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work's Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.

# **References**



